Abstract

It is an understatement to say that both the theory and the applications of probability, conditional or
unconditional, play an essential role in the processing and use of disparate information in
decision-making  in  C4I  systems.  Apropos  to  the  theme  of  this  symposium,  “Making  Information 
Superiority  Happen”,  the  paper  outlined  here  describes  new  applications,  insights,  and  theoretical 
aspects  of  ongoing  work  by  the  authors  toward  improving  the  rationale  for  use  of  probability 
theory,  keeping  in  mind  issues  of  scalability  and  computational  complexity.  This  paper  extends 
the  ideas  first  presented  in  last  year’s  CCRTS  at  Newport,  RI.  In  short,  the  mathematical  theme 
of  this  paper  is  both  a  summary  of  past  research  efforts  together  with  new  results  on  the  problem 
of  best  estimating  partially  specified  conditional  and  unconditional  probabilities  of  interest  via  a 
second order Bayesian probability approach. Among the new derivations provided in this paper
is  a  significant  reduction  in  computational  effort  in  obtaining  (again,  in  the  second  order 
probability  sense)  optimal  or  “near-optimal”  probability  estimates,  all  within  the  setting  of  a 
boolean  “conditional  event  algebra”  which  allows  full  compatibility  with  conditional  probability 
evaluations. 

1.  Introduction 

As stated in the abstract, this work is a direct continuation of the effort presented in [Goodman,
1999]. Even in the simplest-appearing situation, where probabilistic information is present in the
form of specified or a priori known (or estimated) probabilities of certain contributing events, it
may not be readily apparent how to determine, or best estimate, the probability of another
particular event, or events, of interest. In addition, it is possible that this problem cannot be
resolved within the confines of ordinary probability theory, because at times that theory seems to
be at odds with our “commonsense” solution. This phenomenon occurs even at the simplest
levels, as will be illustrated later. Since a basic aspect of reasoning relative to Command &
Control relies heavily upon probability concepts, these issues must be resolved within a
framework that is rigorous yet computationally tractable. Such real-world probabilities are often
not even fully determined theoretically. This typically occurs in rule-based systems, where all
that is known about the conditional probabilities associated with rules are lower bounds on those
probabilities. In addition, many popular techniques, such as Bayes nets (see, e.g., [Pearl, 1988]
for a basic exposition), may not be applicable unless one is willing to make a number of
independence assumptions, assumptions for which there may not always be good justification.
This paper addresses a number of issues connected with the estimation of underspecified
probabilities.

1.1  Notation,  Conventions,  Fundamental  Results 

Recalling the basic identification between classical propositional logic (actually also including
its extension to quantified logic) and boolean (and sigma) algebras of events or sets, as given
through the Stone representation theorem (provided in detail, e.g., in Chapter 5 of [Mendelson,
1970]), we choose throughout this paper to present all concepts via a boolean algebra framework
and its ramifications. Here, letters a, b, c, ... indicate events (or statements which can be true or
false) in a boolean algebra B, where: all disjunctions or unions of events in B are indicated by v
(where any finite quantity of such disjunctions produces another event in B); all complements or
negations of events by (.)', as, e.g., a' (also an event in B); and all conjunctions or intersections of
events by either &, as, e.g., a&b, or, when possible, by simply omitting any symbol, such as ab.
Repeated conjunction or disjunction operations are indicated in the usual way in string form or
with use of an index set below the symbols & or v. At times, capital letters in roman form A,
B, ..., or in italic form A, B, ..., as well as lower case greek letters α, β, γ, ..., will be used to
indicate special events or special collections of events. The universal event or set containing all
events of relevance to the problem at hand is indicated by Ω, while the null event or set is
indicated by ∅. The standard subevent partial order relation is denoted as c ≤ d. Equality of
events is simply denoted by =, etc. The triple (Ω, B, P) refers to a probability space with
probability measure P: B → [0,1] (unit interval). Here, generally, B is a boolean algebra, but if
needed, B may also be a sigma algebra (where all countably infinite repetitions of & and v on
events in B again produce events in B). When necessary to distinguish between the boolean
algebra operators acting on events and logical operators acting upon sets of events or upon index
sets relative to the events, we use the set notation ∩, ∪, ⊆, rather than the corresponding &, v,
≤, etc. When needed to emphasize a point, we use the convention $=_d$ to indicate a definition,
rather than a proved result, and $=_w$ to indicate “which equals”. The notation card(J) refers to the
cardinality of (usually, index) set J.

Metalogical notation, i.e., notation utilized in proving theorems or making remarks about
boolean algebras, probabilities, etc. involved in the results, will employ ordinary “and”, “or”,
“not”, “if-then” or “implies”, “iff” for “if and only if”, i.e., logical equivalence, etc.

Multivariable notation will be applied when more efficient than writing out arguments or vector
components. For example, the family of events $(a_j)_{j \in J}$, for some finite index set J, can also be
denoted simply as $a_J$; the repeated conjunction $\&_{j \in J}(a_j)$ can also be denoted as $\&(a_J)$;
$\&(a'b)_J$ stands for $\&_{j \in J}(a_j' b_j)$; $\vee(a_J)$ for $\vee_{j \in J}(a_j)$; and $P(a_J) = \theta_J$ for
$P(a_j) = \theta_j$, j in J. Also, $1_m$ is that m by 1 vector, each of whose one-dimensional components
is 1, with an analogous definition for $0_m$. More generally, when unambiguous, $1_J$ is that vector
of card(J) components, each being 1, etc. For any matrix or vector A, sum(A) is simply the sum
of all of its elements. In a related vein, iterated summations such as $\sum_{j \in J} P(a_j)$,
$\sum_{j \in J} P(a_j|b_j)$, $\sum_{j \in J} P(a_j b_j)$ will all be streamlined, whenever unambiguous, to be,
respectively, $\Sigma(P(a)_J)$, $\Sigma(P(a|b)_J)$, $\Sigma(P(ab)_J)$, etc. Further multivariable notation will be
provided as needed. In addition, four special binary boolean operators at times will be of use and
are indicated in action for any a, b in B as:

(i) material conditional or logical implication: the classical logic truth-table counterpart (or
event indicator function) is false only when the antecedent is true and the consequent is false
and, in a sense, most naturally models “if-then” or conditional statements from a classical logic
viewpoint

“if b, then a” becomes

$$b \Rightarrow a \;=_d\; b' \vee a \;=_w\; b' \vee ab \;=_w\; (a'b)' \;=_w\; (b \dashv a)', \tag{1.1.1}$$

where “$\dashv$” is defined next.

(ii) (non-symmetric) event difference

“b and not(a)” becomes

$$b \dashv a \;=_d\; a'b. \tag{1.1.2}$$

(iii) symmetric event difference

“(b and not(a)) or (a and not(b))” becomes

$$a \,\Delta\, b \;=_d\; (a'b) \vee (b'a) \;=_w\; (a \Leftrightarrow b)', \tag{1.1.3}$$

where “$\Leftrightarrow$” is defined next.

(iv) logical equivalence

“a iff b” or “(if b, then a) and (if a, then b)” becomes

$$a \Leftrightarrow b \;=_d\; (ab \vee a'b') \;=_w\; ((b \Rightarrow a) \,\&\, (a \Rightarrow b)) \;=_w\; (a \,\Delta\, b)'. \tag{1.1.4}$$

In addition to being aware of the elementary properties of probabilities, we will need on several
occasions to make use of the Frechet-Hailperin-Hoeffding tightest general probability bounds on
conjunction (see, e.g., [Hailperin, 1965, 1984]): For any given probability space (Ω, B, P) and
events $a_j$ in B, j in J, for some finite index set J,

$$\max\Big( \sum_{j \in J} P(a_j) - (\mathrm{card}(J) - 1),\; 0 \Big) \;=_d\; L(a_J, P) \;\le\; P(\&(a_J)) \;\le\; U(a_J, P) \;=_d\; \min_{j \in J} P(a_j). \tag{1.1.5}$$

From now on, the inequalities in eq.(1.1.5) will be referred to as the FHH inequalities.
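
As a quick numerical illustration of the FHH inequalities, the following minimal Python sketch computes the lower and upper bounds on the probability of a conjunction from a list of marginal probabilities. The function name fhh_bounds and the sample values are illustrative assumptions and are not part of the original development.

```python
from typing import Sequence, Tuple


def fhh_bounds(marginals: Sequence[float]) -> Tuple[float, float]:
    """Return (lower, upper) FHH bounds on the probability of a conjunction.

    `marginals` holds P(a_j) for each event a_j, j in J (hypothetical helper).
    """
    if not marginals:
        raise ValueError("need at least one marginal probability")
    if any(p < 0.0 or p > 1.0 for p in marginals):
        raise ValueError("probabilities must lie in [0, 1]")
    # Lower bound: sum of marginals minus (card(J) - 1), floored at zero.
    lower = max(sum(marginals) - (len(marginals) - 1), 0.0)
    # Upper bound: the smallest marginal.
    upper = min(marginals)
    return lower, upper


if __name__ == "__main__":
    low, high = fhh_bounds([0.9, 0.8])
    print(f"{low:.3f} <= P(a1 & a2) <= {high:.3f}")
```

For two events with marginals 0.9 and 0.8, the conjunction is thus confined to the interval from 0.7 to 0.8, whatever the dependence between the events.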

Finally, a word on when conditional probabilities are well defined. Generally speaking,
conditional probabilities such as P(a|b) are only meaningful when P(b) > 0, in which case the
standard definition holds

$$P(a|b) \;=_d\; P(ab)/P(b). \tag{1.1.6}$$

But, at times, individuals have found it convenient to extend the standard definition to yield
either unity or zero for the case when P(b) = 0. In particular, Adams has made such use in
developing his probability logics [Adams, 1996]. While later we will consider the situation
where in fact P(b) = 0 (although not actually defining P(a|b) directly for that case), we will
distinguish carefully throughout this paper between the standard case, when P(b) > 0, and the
nonstandard. From now on, whenever the simple symbol P(a|b) is used, it is assumed that P(b)
> 0.

1.2  One  Motivation  for  the  Work:  The  Transitivity  Problem 

To illustrate the, unfortunately, all-too-often occurring discrepancy between probabilistic and
commonsense reasoning, consider the following example. First, let us abbreviate the following
statements/events: b $=_d$ “enemy is secretly amassing over 100,000 troops ready to attack”; c $=_d$
“political negotiations will fall through and it will be foggy tomorrow morning”; and a $=_d$
“enemy will attack tomorrow morning”. Suppose in this situation that previously acquired
intelligence information indicates that all three events are neither certain nor impossible and that
estimates of the following two conditional probabilities are the only reliable information
available: P(a|b) = 0.9 (approximately) and P(b|c) = 0.8 (approximately).

What can we say about critical desired probabilities such as P(a) or P(a|c)? Using the basic
laws of probability, it can be shown that, unless we make further assumptions, these two
probabilities can take essentially any values in the unit interval. More specifically, Figure 1
illustrates why, in general, in the absence of any specific assumptions, one could have both
P(a|b) and P(b|c) very high, but P(a|c) low or even zero. There, the triangles represent any three
overlapping events a, b, c and the enclosing rectangle represents Ω. A probability measure P is
chosen with mass distributed over a, b, c so that, as usual, P(Ω) = 1. The probability assignments
are shown for the various regions (or relative atoms) scoped out by conjunctions of combinations
of affirmations and negations of a, b, c, where VVL indicates “very, very low” (but not zero), VL
indicates “very low”, L indicates “low”, and H indicates “high”.


P(abc) = VVL,  P(abc') = H,
P(a'bc) = L,  P(a'b'c) = VL,
P(ab'c') = VL,  P(a'bc') = VVL.

(Typically here, VVL = 0.001, VL = 0.01, L = 0.1, H = 0.878.)

P(a|b) = (H + VVL) / (H + L + 2 VVL) ≈ 1  (Typ. Val: 0.897)

P(b|c) = (L + VVL) / (L + VL + VVL) ≈ 1  (Typ. Val: 0.910)

P(a|c) = (VVL) / (L + VL + VVL) ≈ 0  (Typ. Val: 0.0090)


Figure 1. Example for conditional probability extension of classical
transitivity-syllogism problem where premise set has high conditional
probability values, but conclusion has a low (or even zero) conditional
probability value.
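
The typical values quoted in Figure 1 can be checked directly from the atom probabilities. The short Python sketch below does so; the helper names prob and cond are illustrative assumptions, and the atoms not listed in the figure (ab'c and a'b'c') are taken to carry zero mass, which is what makes the listed values sum to one.

```python
# Atom probabilities from Figure 1, using the "typical" values quoted there.
# Keys are (a, b, c) truth assignments; atoms not listed carry zero mass.
VVL, VL, L, H = 0.001, 0.01, 0.1, 0.878

atoms = {
    (True, True, True): VVL,    # abc
    (True, True, False): H,     # abc'
    (False, True, True): L,     # a'bc
    (False, False, True): VL,   # a'b'c
    (True, False, False): VL,   # ab'c'
    (False, True, False): VVL,  # a'bc'
}


def prob(event):
    """Total mass of the atoms (a, b, c) for which event(a, b, c) is true."""
    return sum(p for atom, p in atoms.items() if event(*atom))


def cond(event, given):
    """Conditional probability P(event | given); requires P(given) > 0."""
    denom = prob(given)
    if denom <= 0:
        raise ValueError("conditioning event has zero probability")
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / denom


p_a_given_b = cond(lambda a, b, c: a, lambda a, b, c: b)
p_b_given_c = cond(lambda a, b, c: b, lambda a, b, c: c)
p_a_given_c = cond(lambda a, b, c: a, lambda a, b, c: c)
```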


Let us return to the general problem: given knowledge of P(a|b) and P(b|c), determine P(a|c)
in some way. If we also knew P(a|bc) and P(a|b'c), then using P(b|c) as well (without using
P(a|b)), we could easily determine P(a|c), as the reader can check. Or, if we make the
P-conditional independence assumption that b alone, without c, is “sufficient” for a, i.e.,
P(a|bc) = P(a|b), and again assume that P(a|b'c) is known, we could determine P(a|c) completely
by elementary probability considerations, using both P(a|b) and P(b|c), as in

$$P(a|c) = P(ab|c) + P(ab'|c) = P(a|bc)P(b|c) + P(a|b'c)P(b'|c) = P(a|b)P(b|c) + P(a|b'c)(1 - P(b|c)). \tag{1.2.1}$$
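
To see how much the unknown P(a|b'c) matters under the conditional independence assumption P(a|bc) = P(a|b), the following Python sketch evaluates the final expression of eq.(1.2.1). The function name p_a_given_c and the parameter names s, t, r are illustrative assumptions.

```python
def p_a_given_c(s: float, t: float, r: float) -> float:
    """Evaluate eq.(1.2.1) under the assumption P(a|bc) = P(a|b).

    s = P(a|b), t = P(b|c), r = P(a|b'c); all names are illustrative.
    """
    for name, value in (("s", s), ("t", t), ("r", r)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return s * t + r * (1.0 - t)


if __name__ == "__main__":
    s, t = 0.9, 0.8
    # Sweep the unknown P(a|b'c) across the unit interval.
    for r in (0.0, 0.5, 1.0):
        print(f"P(a|b'c) = {r:.1f}  ->  P(a|c) = {p_a_given_c(s, t, r):.2f}")
```

With s = 0.9 and t = 0.8, the result ranges from 0.72 to 0.92 as P(a|b'c) runs from 0 to 1, so even this assumption leaves P(a|c) only partially determined.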

Still  other  assumptions  can  be  made  about  a,  b,  c  and  P  to  estimate  P(alc)  in  some  sense.  (See, 
e.g.,  Section  6  of  [Bamber  et  al.,  2000].) 

On the other hand, intuitively speaking, when P(a|b) and P(b|c) are both high, even though it is
possible that some probability configuration among a, b, and c may yield P(a|c) = 0, it
seems that on the average (whatever that means!) P(a|c) should also be high, though possibly
somewhat lower than both 0.9 and 0.8. This intuitively desirable property, called at times
transitivity, chaining, or hypothetical syllogism in the literature concerning the extension of
classical reasoning to a probability framework [Bamber et al., 2000; Pearl, 1988], has itself been
the center of much controversy over the past several years in attempts to design rule-based
systems which follow the laws of probability but also agree with commonsense reasoning, as the
above example illustrates. One basic reason for this is that rule-based systems usually operate
upon the sequential “firing” of rules, i.e., when the antecedent of one rule matches the
consequent of another. But when such rules are actually not 100% reliable, as is often the case,
yet for purposes of convenience (and the usual real-world tradeoff in using something that is
highly reliable but not perfect) still form part of the system, the transitivity problem is in effect
present, even if one tacitly ignores it in operating the system. In a related vein, one should
mention tacit alternatives to the problem of extending transitivity and other desirable properties
of reasoning systems to a probability framework: the “certainty factors” utilized in [Buchanan &
Shortliffe, 1984] and other ad hoc procedures for combining reliabilities of inference rules, as
discussed in [Hayes-Roth et al., 1983].

Note also that, instead of interpreting the above conditional statements via naturally
corresponding conditional probabilities, the statements “if b, then a”, “if c, then b”, and “if c,
then a” could first be modeled through the classical logic (or boolean algebra) material
conditional operator and then evaluated probabilistically. In that case, it easily follows that since

$$(b \Rightarrow a) \,\&\, (c \Rightarrow b) = b'c' \vee ab \le c' \vee a = c \Rightarrow a, \tag{1.2.2}$$

by the monotonicity property of probability, for any P over B,

$$P((b \Rightarrow a) \,\&\, (c \Rightarrow b)) \le P(c \Rightarrow a), \tag{1.2.3}$$

and applying the lower bound FHH to eq.(1.2.3), we obtain

$$[P(b \Rightarrow a) > s,\; P(c \Rightarrow b) > t] \text{ implies } [P(c \Rightarrow a) > s + t - 1], \text{ for any } 1/2 < s, t < 1, \tag{1.2.4}$$

and thus have a seemingly satisfactory solution to the transitivity problem (where both the
probabilistic analysis and commonsense reasoning apparently agree).
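
The boolean relation in eq.(1.2.2) can be confirmed by enumerating the eight truth assignments to a, b, c, each of which corresponds to one atom. The sketch below does this in Python and also evaluates the resulting FHH-based lower bound of eq.(1.2.4); the helper names implies and transitivity_lower_bound are illustrative assumptions.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material conditional x => y, i.e. (not x) or y."""
    return (not x) or y


# Every truth assignment to (a, b, c) is one atom of the boolean algebra.
for a, b, c in product((False, True), repeat=3):
    premises = implies(b, a) and implies(c, b)
    # Middle expression of eq.(1.2.2): b'c' v ab.
    assert premises == ((not b and not c) or (a and b))
    # Event inequality: whenever both premises hold, so does c => a.
    assert implies(premises, implies(c, a))


def transitivity_lower_bound(s: float, t: float) -> float:
    """FHH-based lower bound on P(c => a), floored at zero."""
    return max(s + t - 1.0, 0.0)
```

For thresholds s = 0.9 and t = 0.8 the bound evaluates to 0.7.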

However, one basic fact that precludes a consistent use of (b => a) in interpreting the conditional
expression “if b, then a”, and of the subsequent probability evaluation P(b => a) for measuring the
degree of reliability (or the uncertainty) of the rule “if b, then a”, is that the evaluation

$$P(b \Rightarrow a) = 1 - P(b) + P(ab) \tag{1.2.5}$$

increases toward unity when P(b) decreases toward zero, regardless of the relationship
between P(b) and P(ab), which, of course, P(a|b) completely respects. Moreover, a well-known
inequality provides a good quantitative measure of the difference between the two approaches to
modeling (see, e.g., [Calabrese, 1987; Goodman & Nguyen, 1995] for further discussions):

$$P(b \Rightarrow a) = P(a|b) + P(b')P(a'|b) \ge P(a|b). \tag{1.2.6}$$
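
The gap between the two evaluations is easy to exhibit numerically. In the Python sketch below, P(a|b) is held fixed at a low value while P(b) shrinks; the function names and the chosen numbers are illustrative assumptions.

```python
def material_conditional_prob(p_b: float, p_ab: float) -> float:
    """P(b => a) = 1 - P(b) + P(ab), as in eq.(1.2.5)."""
    return 1.0 - p_b + p_ab


def conditional_prob(p_b: float, p_ab: float) -> float:
    """P(a|b) = P(ab) / P(b), defined only for P(b) > 0."""
    if p_b <= 0.0:
        raise ValueError("P(b) must be positive")
    return p_ab / p_b


if __name__ == "__main__":
    ratio = 0.1  # hold P(a|b) fixed at a low value while P(b) shrinks
    for p_b in (0.5, 0.1, 0.01):
        p_ab = ratio * p_b
        print(p_b, conditional_prob(p_b, p_ab), material_conditional_prob(p_b, p_ab))
```

With P(b) = 0.01 and P(ab) = 0.001, the material conditional evaluates to 0.991 even though P(a|b) is only 0.1.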

On the other hand, when P(a|b) = 1 and P(b|c) = 1, a little manipulation shows that necessarily
P(a|c) = 1. In fact, this case generalizes the classical (barbara-type) syllogism typified by the
well-known paradigm “All men are mortal”, “I am a man”, therefore “I am mortal”. (See, e.g.,
[Prior et al., 1967; Copi, 1986; Goodman, 1999] for discussion and background on this classical
syllogism.) Thus, we see that a real sort of discontinuity exists between the certain conditional
probability or classical reasoning case and the general nontrivial conditional probability case for
potential transitivity, keeping in mind the additional difficulty illustrated above that the material
conditional-plus-probability-evaluation approach is also not satisfactory, despite its formally
satisfying transitivity at all probability levels.

Besides the transitivity problem, a number of other fundamental problems exist in reasoning
which also yield similar apparent discrepancies with our commonsense understanding, including
contraposition and strengthening, discussed later. In all of the above-mentioned problems, the
desired probability subject to the given constraints is so underspecified that in general it can
range over the entire unit interval. In such situations it appears that a number of previously
established approaches to estimating varying probabilities may not be adequate. These include
the many general bounding, probability-bounding, and upper and lower probability techniques,
as well as random set and related (belief, plausibility, etc.) function techniques, as provided, e.g.,
in [Alefeld & Herzberger, 1983; Hailperin, 1996; Walley, 1991; Dempster, 1967; Shafer, 1976;
Goodman et al., 1997]. On the other hand, these techniques, used with appropriate caution (see,
e.g., [Nguyen, 1978; Chapters 3, 4 of Goodman & Nguyen, 1985]), may provide a viable
alternative to that which we present here in the subsequent sections. In yet another direction,
there is the basic, or naive, maximum entropy approach, which picks a specific P by maximizing
entropy subject to the constraints of the problem and then uses that P to evaluate the desired
conclusion probability. This, indeed, may also furnish a possible reasonable approach to these
issues, as developed, e.g., in [Rodder, 2000], based on general principles as found, e.g., in
[Kapur, 1994]. However, as in the bounding approaches, such use of maximum entropy must be
carried out with caution, as will be seen later.

1.3  Use  of  Second  Order  Probability  in  Addressing  the  Transitivity  Problem

In [Goodman & Nguyen, 1998; Goodman, 1999; Bamber et al., 2000] it was shown that a
reasonable way to analyze the transitivity problem within a completely rigorous mathematical
framework, compatible with all the laws of probability, is to interpret the above expression “on
the average” to mean that, instead of attempting hopelessly to pick out the “true” value of P(a|c)
from the unit interval, one should average the value P(a|c) over all possible choices of probability
measures P, subject to the given constraints P(a|b) = 0.9 and P(b|c) = 0.8. But this Bayesian
method requires a choice of second order probability, i.e., a choice of probability distribution of
the probability measures themselves! (Second order probability techniques have already proven
useful in addressing update problems as in [Goodman & Nguyen, 1999a] and may also be found
in the older treatise of [Aitchison, 1986].) Suppose, for simplicity and for lack of any other
information, appealing, e.g., to a second order maximal entropy (or, equivalently, most ignorance
of information) argument, as opposed to the naive (first order) maximum entropy approach
discussed earlier, we choose this second order distribution to be, in a natural sense, uniform over
the possible candidate probability measures. Then, it can be shown [Goodman, 1999; Bamber et
al., 2000] that no matter what the threshold values s = P(a|b) and t = P(b|c) are, a closed-form
expression in the variables s and t can actually be computed for the P-averaged P(a|c), which, in
agreement with commonsense, does in fact approach unity as s and t approach unity. In
addition, a reasonable upper bound can also be obtained for the error variance (between this
estimate and actual possible values), as the probabilities vary uniformly. Computations for
related procedures produce closed-form results in a number of cases of interest besides
transitivity, but these approaches do not, at first analysis, appear to be generalizable, because of
the difficulty in evaluating multiple integrals over spaces of constrained probability measures.
However, recent efforts have produced promising modifications and approximations applicable
to the general case, as outlined in this paper.

Returning to the case of transitivity, the formula for the averaged value of P(a|c) with respect to
P varying uniformly and with P(a|b) and P(b|c) known, as well as bounds on its variance and
expected deviation from its limiting unity value, and bounds on the associated (second order)
probabilities, are given below:

$$E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) = st + f(s,t), \quad 1 > s, t > 1/2, \tag{1.3.1}$$

where the correction term f(s,t) > 0 in eq.(1.3.1) is

$$f(s,t) \;=_d\; \frac{1-t}{2} - \frac{s(1-s)(2s-1)\,t\,(1-t^2)}{t + 2t^2 + s(1-s)(1-t)(2+3t-t^2)} > 0, \quad 1 > s, t > 1/2. \tag{1.3.2}$$

In other words, for this generic example, the averaged-out value of P(a|c) is approximately the
same as if the “conditional events” “a given b” and “b given c” were P-independent, for all P, up
to the correction term f. For the specific example at hand, where s = 0.9 and t = 0.8, the
averaged value of P(a|c) is approximately 0.75 (as compared to 0.72 for the formal assumed
independence). Note, as stated above, that as s, t approach unity, f(s,t) approaches zero and the
expectation approaches unity, agreeing with the commonsense argument. In mathematical
notation this is

$$\lim_{(s,t) \to 1} E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) = 1. \tag{1.3.3}$$

In turn, because all probabilities lie in the unit interval, the conditional variance

$$\begin{aligned}
&\mathrm{Var}_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) \\
&\quad = E_P([P(a|c) - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2 \mid P(a|b) = s,\; P(b|c) = t) \\
&\quad = E_P((P(a|c))^2 \mid P(a|b) = s,\; P(b|c) = t) - [E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2 \\
&\quad \le E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) - [E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2 \\
&\quad = E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)\,(1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)) \\
&\quad \le 1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t),
\end{aligned} \tag{1.3.4}$$

where for values of $E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)$ not that close to unity, the second to the
bottom expression in eq.(1.3.4) can be used as an upper bound estimate of the conditional
variance. Thus, also the averaged deviation of P(a|c) from unity, which is

$$\begin{aligned}
&E_P((P(a|c) - 1)^2 \mid P(a|b) = s,\; P(b|c) = t) \\
&\quad = E_P([P(a|c) - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2 + [E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) - 1]^2 \mid P(a|b) = s,\; P(b|c) = t) \\
&\quad = \mathrm{Var}_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) + [1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2,
\end{aligned} \tag{1.3.5}$$

has also a computable upper bound, obtainable by applying the variance bound of eq.(1.3.4) in
eq.(1.3.5). The unity limiting form in eq.(1.3.3) shows that the variance and the averaged
deviation of P(a|c) from unity both approach zero. Since the usual triangle inequality provides

$$0 \le 1 - P(a|c) \le 1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t) + |P(a|c) - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)|, \tag{1.3.6}$$

the standard Chebychev inequality (see, e.g., page 95 of [Rao, 1973]) shows, for any λ > 0,
denoting by Prob the corresponding second order (posterior) probability measure determined
through the uniform prior we chose for the P’s,

$$\mathrm{Prob}([P(a|c) - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]^2 > \lambda \mid P(a|b) = s,\; P(b|c) = t) \le (1/\lambda) \cdot \mathrm{Var}_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t). \tag{1.3.7}$$

Combining eqs.(1.3.4), (1.3.6), (1.3.7), for any λ > 0,

$$\mathrm{Prob}([1 - P(a|c)]^2 > \lambda \mid P(a|b) = s,\; P(b|c) = t) \le \frac{1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)}{\left(\lambda^{1/2} - [1 - E_P(P(a|c) \mid P(a|b) = s,\; P(b|c) = t)]\right)^2}. \tag{1.3.8}$$

Finally, eq.(1.3.3) applied to eq.(1.3.8) shows that

$$\lim_{(s,t) \to 1} \mathrm{Prob}([1 - P(a|c)]^2 > \lambda \mid P(a|b) = s,\; P(b|c) = t) = 0, \tag{1.3.9}$$

at least at the rate described in eq.(1.3.8), so that in standard probability parlance, P(a|c)
converges in (second order) probability to unity, under the condition that P(a|b) and P(b|c)
themselves converge to unity, assuming that otherwise any P is equally likely.
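
The variance bound of eq.(1.3.4) and the decomposition of eq.(1.3.5) depend only on the quantity of interest lying in the unit interval, so they can be checked on any sample of such values. The Python sketch below does this; the Beta-distributed draws are purely a stand-in assumption and are not the second order posterior discussed in the text, and the helper name moments is likewise hypothetical.

```python
import random
from statistics import fmean


def moments(values):
    """Return (mean, variance, mean squared deviation from 1) of the sample."""
    mean = fmean(values)
    var = fmean((x - mean) ** 2 for x in values)
    dev_from_one = fmean((x - 1.0) ** 2 for x in values)
    return mean, var, dev_from_one


random.seed(0)
# Stand-in draws for P(a|c); an illustrative assumption, NOT the actual
# second order posterior discussed in the text.
sample = [random.betavariate(8, 2) for _ in range(10_000)]
mean, var, dev = moments(sample)

# eq.(1.3.4): Var <= mean * (1 - mean) <= 1 - mean for values in [0, 1].
assert var <= mean * (1.0 - mean) <= 1.0 - mean
# eq.(1.3.5): E[(X - 1)^2] = Var + (1 - mean)^2.
assert abs(dev - (var + (1.0 - mean) ** 2)) < 1e-9
```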

2.  Use  of  Second  Order  Probability  in  Analyzing  Whether  Various  Types  of  Desirable 
Reasoning  Properties  Can  Be  Extended  to  a  Probability  Setting:  An  Introduction 

2.1  Some  Particular  Examples 

As in the transitivity problem discussed in Section 1.3, a second order probability approach can
be used to determine whether particular types of desirable properties that classical logic
possesses, or that more general reasoning systems should possess, carry over to a
probability-based reasoning system.

In addition to transitivity, closed-form results have been obtained involving a number of other
reasoning system properties, including contraposition, positive conjunction, and strengthening.

In the case of contraposition, the premise involved in natural language form is “if b, then a” and
the (potential) conclusion is “if not(a), then not(b)”, a well known valid classical logic property;
in fact, one will recall that in boolean form this is the same as the identity

$$b \Rightarrow a = a' \Rightarrow b'. \tag{2.1.1}$$

When the above forms are not only interpreted through the material conditional operator, but
also evaluated via probability, just as in the attempt to address the transitivity problem (see
eq.(1.2.4)), the classical logic property seems to extend trivially to a probability setting:

$$[P(b \Rightarrow a) > s] \text{ implies } [P(a' \Rightarrow b') > s], \text{ for any } 0 < s < 1. \tag{2.1.2}$$

But, again, as in the transitivity case (see the remarks following eq.(1.2.4)), because of the basic
undesirable consequences of using probability evaluations of the material conditional operator as
interpretations of probability evaluations of conditional expressions, such an extension is not
satisfactory. On the other hand, just as in the transitivity case (again, see Figure 1 of Section
1.2), in general, given only the information P(a|b) = s, even for s quite close to unity, we can find
particular probability measures P such that P(b'|a') is close to, or actually, zero. Thus, once more,
we are led to ask whether a second order probability approach to this quandary produces a more
reasonable result. Specifically, we ask what the P-averaged value of P(b'|a') is for the given
constraint P(a|b) = s, and whether that evaluation approaches unity as s approaches unity.
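
A concrete instance of such a measure is easy to write down. In the Python sketch below, the atom probabilities are hypothetical values chosen only so that P(a|b) is high while P(b'|a') vanishes; the helper name prob is likewise an assumption.

```python
# Atoms are (a, b) truth assignments with hypothetical probabilities chosen so
# that P(a|b) is high while P(b'|a') is zero.
atoms = {
    (True, True): 0.45,    # ab
    (False, True): 0.05,   # a'b
    (True, False): 0.50,   # ab'
    (False, False): 0.00,  # a'b'
}


def prob(event):
    """Total mass of atoms (a, b) on which event(a, b) holds."""
    return sum(p for atom, p in atoms.items() if event(*atom))


p_a_given_b = prob(lambda a, b: a and b) / prob(lambda a, b: b)
p_notb_given_nota = prob(lambda a, b: not a and not b) / prob(lambda a, b: not a)

# The material conditional cannot tell the two directions apart (eq. 2.1.1).
p_b_implies_a = prob(lambda a, b: (not b) or a)
p_nota_implies_notb = prob(lambda a, b: a or (not b))
```

Here P(a|b) is 0.9 and both material conditional evaluations equal 0.95, yet P(b'|a') is exactly zero.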

In the case of positive conjunction, the premise involved in natural language form is {“if b, then
a”, “if c, then a”} and the (potential) conclusion is “if b and c, then a”. Once again, as in the
transitivity and contraposition cases, the classical logic interpretation via the material conditional
operator in boolean form is easily seen to produce the inequality

$$(b \Rightarrow a) \,\&\, (c \Rightarrow a) = b'c' \vee b'ac \vee c'ab \vee abc \le (bc \Rightarrow a), \tag{2.1.3}$$

whence, analogous to the use of the FHH lower bound to produce eq.(1.2.4),

$$[P(b \Rightarrow a) > s,\; P(c \Rightarrow a) > t] \text{ implies } [P(bc \Rightarrow a) > s + t - 1], \text{ for any } 1/2 < s, t < 1. \tag{2.1.4}$$

But, once more, as in both the transitivity and contraposition cases, the potential value of the
result in eq.(2.1.4) is diminished by the difficulties involving the probability of the material
conditional. And, once more, it can be shown that probabilities P exist for which P(a|b) = s and
P(a|c) = t, with s and t reasonably high, yet P(a|bc) is quite low or even zero. Thus, again we are
led to ask whether a P-averaged evaluation of P(a|bc) subject to these constraints produces a more
reasonable result.

Finally, we mention the property of strengthening, where in a natural language setting the
premise is “if b, then a” and the (potential) conclusion is “if b and c, then a”. The classical logic
counterpart, using again the material conditional interpretation of “if-then”, is seen to be valid
via the simple inequality in boolean form

$$(b \Rightarrow a) = b' \vee a \le b' \vee c' \vee a = (bc \Rightarrow a), \tag{2.1.5}$$

which, if P is applied to both sides of eq.(2.1.5), produces via the standard monotonicity property
of probability the equally simple relation

$$[P(b \Rightarrow a) > s] \text{ implies } [P(bc \Rightarrow a) > s], \text{ for all } s,\; 0 < s < 1. \tag{2.1.6}$$

However, as in the cases of transitivity, contraposition, and positive conjunction, the same
pattern of difficulty holds: First, eq.(2.1.6) is of limited value due to the general difficulty in
using the probability of the material conditional operator approach to modeling uncertain
conditioned information; second, “counterexamples” can be found for which P(a|b) is quite high,
yet P(a|bc) is very low or zero. Thus, yet again, we ask whether, by suitably averaging P(a|bc)
over possible P’s, subject to the constraint P(a|b) = s, we can obtain a computable function of s
which approaches unity as s does.

Table  1  not  only  provides  a  summary  of  closed-form  computations  for  P-averaged  conclusion 
probabilities  for  transitivity,  contraposition,  positive  conjunction,  and  strengthening,  but  also  for 
a  number  (about  thirty  total)  of  other  properties  (desired  and  undesired)  of  logical  systems  in  a 
second  order  (uniform  prior)  probability  framework.  In  the  case  of  the  four  examples  discussed 
so  far,  all  of  them  yield  P-averaged  conclusions  that  do  indeed  approach  unity  as  threshold(s)  s 
and/or  t  approach  unity,  compatible  with  commonsense  reasoning. 

2.2  Generalizations 

In order to put the above ideas on a more rigorous and general basis, consider the following: Begin with a probability space $(\Omega, \mathcal{B}, P)$, with $P$ now arbitrarily variable, and given events $a_j, b_j, c, d$ in $\mathcal{B}$, for which we assume, without loss of generality (attempting to avoid trivial results), that $\emptyset < a_j < b_j$, all $j$ in $J$, and $\emptyset < c < d$. We also consider a given premise set $(a_j \mid b_j)_{j \in J}$, where, formally, each “conditional event” $(a_j \mid b_j)$ corresponds to the conditional expression “if $b_j$, then $a_j$”, before being evaluated via any choice of (well defined) $P$ as $P(a_j \mid b_j)$, etc. In multivariable notation, $(a \mid b)_J$ forms the premise set, while, for simplicity, the single conditional event $(c \mid d)$ forms the potential conclusion. In turn, we seek to determine two types of mean conclusion functions, $\mathrm{meanconc}_i((a \mid b)_J; (c \mid d)): A \to [0,1]$, $i = 1, 2$, where $A$ is a subset of $[0,1]^J$, such as $(\text{open interval}\ (1/2, 1))^J$, and for each choice of $t_J = (t_j)_{j \in J}$ in $A$,

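$$\mathrm{meanconc}_1((a \mid b)_J; (c \mid d))(t_J) := E_P\big(P(c \mid d) \,\big|\, P(a \mid b)_J = t_J\big), \tag{2.2.1}$$

$$\mathrm{meanconc}_2((a \mid b)_J; (c \mid d))(t_J) := E_P\big(P(c \mid d) \,\big|\, P(a \mid b)_J \ge t_J\big), \tag{2.2.2}$$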


where symbolically $P(a \mid b)_J = t_J$ means $P(a_j \mid b_j) = t_j$, for all $j$ in $J$, and similarly, $P(a \mid b)_J \ge t_J$ means $P(a_j \mid b_j) \ge t_j$, for all $j$ in $J$. The mean conclusion functions will be fully determined once we make rigorous what was meant earlier by all $P$ being “equally likely” when $P$ varies. A natural way to capture this is as follows: Suppose first that

$$\mathcal{A} := \{\alpha_1, \ldots, \alpha_{m+1}\} \subseteq \mathcal{B} \tag{2.2.3}$$

is a set of atoms of $\mathcal{B}$ with respect to $((a \mid b)_J; (c \mid d))$. That is, the $\alpha_j$’s form a nonvacuous, exhaustive partitioning of $\Omega$, with each $\alpha_j$ in $\mathcal{B}$, so that any nonvacuous boolean function $f(V_0((a \mid b)_J; (c \mid d)))$ of event variables $V_0((a \mid b)_J; (c \mid d))$ from the antecedents and consequents of the premise set of conditionals $(a \mid b)_J$ and potential conclusion conditional $(c \mid d)$, where

$$V_0((a \mid b)_J; (c \mid d)) := \{a_j,\ a_j'b_j,\ b_j' : j \in J\} \cup \{c,\ c'd,\ d'\}, \tag{2.2.4}$$

can be (uniquely, necessarily) expressed as a disjoint disjunction of $\alpha_j$’s, indicated through use of the index set $I(\cdot) \subseteq \{1, \ldots, m+1\}$ in

$$f(V_0((a \mid b)_J; (c \mid d))) = \bigvee_{j \in I(f(V_0((a \mid b)_J; (c \mid d))))} \alpha_j. \tag{2.2.5}$$

To do this it is clearly sufficient to determine whether each possible conjunctive combination of affirmations and negations of all of $V_0((a \mid b)_J; (c \mid d))$ is either equal to $\emptyset$ or a disjoint disjunction of $\alpha_j$’s. Note that the coarsest set of atoms (equivalently, the smallest possible set of atoms, or the set of atoms generated by $\mathcal{A}$ with respect to $((a \mid b)_J; (c \mid d))$) is precisely the set of all such nonvacuous possible combinations as above and is denoted as

$$\mathcal{A}(V_0)((a \mid b)_J; (c \mid d)). \tag{2.2.6}$$

Thus, any choice of $P$, with respect to its evaluations of all possible boolean functions over $V_0((a \mid b)_J; (c \mid d))$, is uniquely determined by the evaluation of $P$ at each atom $\alpha_j$ in $\mathcal{A}$. In fact, the vector $\mathbf{P} = (P(\alpha_j))_{j = 1, \ldots, m+1}$, for the purpose of further analysis, may now be identified with $P$, and the set of possible $P$’s relevant to any such investigation becomes simply the simplex $\{\mathbf{P}: 0_{m+1} \le \mathbf{P} \le 1_{m+1} \text{ and } \mathrm{sum}(\mathbf{P}) = 1, \text{ where } \mathbf{P} \text{ is otherwise arbitrary}\}$. For convenience, the above simplex, being actually of dimension one less than the number of atoms $m+1$, and therefore singular in $m+1$ dimensions, is replaced by designating one atom, say $\alpha_{m+1}$, and omitting its evaluation from $\mathbf{P}$ in the above simplex, but still keeping track of it. That is, we consider the $m$-simplex, replacing $\mathbf{P}$ by the variable $X$ (with $m$ components, the $i$th component being $x_i = P(\alpha_i)$) and the missing component,

$$x_{m+1} = P(\alpha_{m+1}) = 1 - \mathrm{sum}(X) \tag{2.2.7}$$

and

$$S_m := \{X : 0_m \le X \le 1_m,\ \mathrm{sum}(X) \le 1\}. \tag{2.2.8}$$

Then, the concept of $P$ being equally likely becomes equivalent to assuming that $X$ can be identified as a random vector uniformly distributed over $S_m$ (which, in turn, determines the behavior of the missing component). In terms of bayesian analysis, this corresponds to choosing a prior probability distribution which is uniform over $S_m$. This distribution is a second order one in the sense described previously: it is, in effect, a distribution of probabilities themselves. As implicitly stated above, the set of atoms $\mathcal{A}$ could be chosen to be $\mathcal{A}(V_0)((a \mid b)_J; (c \mid d))$. Choice of the most appropriate set of atoms is somewhat arbitrary, but since all results depend on this choice, the simplest and most natural, at times, may be $\mathcal{A}(V_0)((a \mid b)_J; (c \mid d))$.

Also, at times, it will be convenient to consider a class of possible priors to choose from, rather than be restricted to just the uniform distribution. One family of distributions over $S_m$ that includes the uniform one, has a natural characterization compatible with the modeling here, and possesses many desirable properties (including closure with respect to all index disjoint sums and marginals, among others) is the Dirichlet family, indicated symbolically as $\mathrm{dir}(\lambda)$, with ($m+1$ by 1) parameter vector $\lambda > 0_{m+1}$. The parameter vector $\lambda$ is associated with the expectation of $\mathrm{dir}(\lambda)$ (in fact, it is the expectation up to normalization via $\mathrm{sum}(\lambda)$), and prior knowledge of the expectation, if available, can be transformed into a choice of $\lambda$. For the special case of the uniform distribution over $S_m$, $\lambda = 1_{m+1}$ and all the component ($x_i$) expectations are identical to $1/(m+1)$. For more details on properties and characterization of the Dirichlet family, see [Goodman & Nguyen, 1999a; Section 7.7 of Wilks, 1963; Chapter 40 of Johnson & Kotz, 1972].
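As a rough illustration of the second order prior, the following sketch draws probability vectors from a Dirichlet distribution using the standard normalized-gamma construction and estimates the component expectations. The function names, sample sizes, and the second parameter vector are assumptions made for this example; for simplicity the full vector of $m+1$ atom probabilities is sampled rather than the $m$-component vector $X$.

```python
import random

def sample_dirichlet(lam, rng):
    """Draw one point of the probability simplex from a Dirichlet distribution.

    Standard construction: independent Gamma(lam_i, 1) variates divided by
    their sum.
    """
    g = [rng.gammavariate(l, 1.0) for l in lam]
    total = sum(g)
    return [v / total for v in g]

def mean_components(lam, n, seed=0):
    """Monte Carlo estimate of the expectation of each component."""
    rng = random.Random(seed)
    acc = [0.0] * len(lam)
    for _ in range(n):
        for i, v in enumerate(sample_dirichlet(lam, rng)):
            acc[i] += v
    return [s / n for s in acc]

m = 3
# Uniform prior: all parameters equal to 1, expectation 1/(m+1) per atom.
uniform_means = mean_components([1.0] * (m + 1), 20000)
# A non-uniform (hypothetical) choice: expectation is lam / sum(lam).
skewed_means = mean_components([4.0, 2.0, 1.0, 1.0], 20000)
print(uniform_means)
print(skewed_means)
```

The estimates should lie close to $1/(m+1)$ in the uniform case and close to the normalized parameter vector in the second case, up to Monte Carlo error.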

Returning to the interpretation of type 1 meanconc functions in eq.(2.2.1), in light of the atomic representation of any event and that of any probability measure here, the conditional probabilities involved can be reinterpreted as simple bilinear functions of the variable $X$, provided that the designated atom $\alpha_{m+1} \le$ … Letting $a_j$ also denote the $m$ by 1 vector whose components are either 1 or 0, depending on whether, for the $i$th component, atom $\alpha_i \le a_j$ or not (i.e., in the latter case, $\alpha_i$ is necessarily disjoint from $a_j$), then

$$P(a_j \mid b_j) = \frac{a_j^T X}{a_j^T X + (a_j'b_j)^T X}, \ j \in J; \qquad P(c \mid d) = \frac{c^T X}{c^T X + (c'd)^T X}. \tag{2.2.9}$$

Using eq.(2.2.9), each probability constraint $P(a_j \mid b_j) = t_j$ in the antecedent of the conditional expectation in eq.(2.2.1) becomes

$$a_j^T X = t_j \cdot \big(a_j^T X + (a_j'b_j)^T X\big), \quad j \in J,$$

i.e., the $j$th plane in the variable $X$ determined by

$$\big((1 - t_j) \cdot a_j - t_j \cdot (a_j'b_j)\big)^T X = 0, \quad j \in J. \tag{2.2.10}$$

The counterpart of eq.(2.2.10) for the type 2 meanconc function is

$$\big((1 - t_j) \cdot a_j - t_j \cdot (a_j'b_j)\big)^T X \ge 0, \quad j \in J, \tag{2.2.11}$$

corresponding to the ordinary probability relations

$$P(a_j'b_j) \le \big((1 - t_j)/t_j\big) \cdot P(a_j), \quad j \in J. \tag{2.2.12}$$
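The indicator-vector form of eqs.(2.2.9)-(2.2.11) can be sketched for a single premise conditional with $a \le b$. The atom names and the numerical values of $X$ are hypothetical, and the atom $b'$ plays the role of the designated (omitted) atom.

```python
def indicator(event, atoms):
    """0/1 vector marking which of the listed atoms lie inside the event."""
    return [1 if atom in event else 0 for atom in atoms]

def dot(u, x):
    return sum(ui * xi for ui, xi in zip(u, x))

# Hypothetical example with a <= b: listed atoms are a (= ab) and a'b;
# the designated atom b' is omitted from X.
atoms = ["a", "a'b"]
event_a = {"a"}
event_not_a_b = {"a'b"}
X = [0.6, 0.15]               # x_1 = P(a), x_2 = P(a'b); P(b') = 1 - sum(X)

va = indicator(event_a, atoms)
vnab = indicator(event_not_a_b, atoms)

# Eq. (2.2.9): ratio of two linear forms in X.
p_a_given_b = dot(va, X) / (dot(va, X) + dot(vnab, X))

def plane_value(t):
    """Left-hand side of eqs. (2.2.10)/(2.2.11) for threshold t."""
    normal = [(1 - t) * i - t * j for i, j in zip(va, vnab)]
    return dot(normal, X)

print(p_a_given_b, plane_value(0.8), plane_value(0.7), plane_value(0.9))
```

The plane value vanishes exactly when the threshold equals $P(a \mid b)$, is positive for lower thresholds (the type 2 constraint holds), and is negative for higher ones.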


With all of the above stated, eqs.(2.2.1) and (2.2.2) can be rigorously interpreted, whether for the special case of $X$ being uniform over $S_m$ or the more general case where $X$ has a Dirichlet distribution $\mathrm{dir}(\lambda)$ over $S_m$. As seen in the transitivity case earlier, in addition to determining $\mathrm{meanconc}_i((a \mid b)_J; (c \mid d))(t_J)$ for all $t_J$ in its domain, the limiting case as $t_J \uparrow 1_J$ is of interest. When

$$\lim_{t_J \uparrow 1_J} \mathrm{meanconc}_i((a \mid b)_J; (c \mid d))(t_J) = 1, \tag{2.2.13}$$

we say that $(c \mid d)$ is deduced from $(a \mid b)_J$ in the expected (or averaged) probability logical sense $i$, $i = 1, 2$. (Bamber [2000] prefers the term “near surety” in developing a related logic.) When this holds, we will write this relation as

$$(a \mid b)_J \le_{\mathrm{EPL}} (c \mid d). \tag{2.2.14}$$


3.  Probability  Estimation  Procedures  and  Their  Relation  to  the  Expected  Probability 
Logic  Approach:  A  First  Glimpse 

While the approach taken here uses the mean conclusion function for determining properties of reasoning schemes in probability in a second order probability sense, other related approaches to the same issues exist. Bamber [2000] has pointed out that, at least in the limiting sense, the “rational closure” approach in [Lehmann & Magidor, 1992] and that of “system Z” in [Pearl, 1990] essentially coincide with averaged surety deduction of type 2. However, the actual nonlimiting case evaluation of the meanconc function has no counterpart in these two approaches. On the other hand, alternative approaches based on first order probability considerations, including those of Adams (see again [Adams; 1975, 1996]), where the idea of a minimal conclusion function is developed, and that of naive maximal entropy, such as elaborated upon by [Rodder, 2000], can produce different limiting, as well as non-limiting antecedent threshold, evaluations as compared to those computed via meanconc. Adams’ minimum conclusion function counterpart of the mean conclusion function is the function $\mathrm{minconc}((a \mid b)_J; (c \mid d)): A \to [0,1]$, where for any $t_J$ in $A \subseteq [0,1]^J$, giving here for convenience only the type 2 counterpart,

$$\mathrm{minconc}_2((a \mid b)_J; (c \mid d))(t_J) := \inf\{P(c \mid d) : \text{all possible } P,\ P(a \mid b)_J \ge t_J\}. \tag{3.1}$$

The standard naive maximum entropy function counterpart is the function $\mathrm{maxent}((a \mid b)_J; (c \mid d)): A \to [0,1]$, where for convenience we give the type 1 counterpart

$$\mathrm{maxent}_1((a \mid b)_J; (c \mid d))(t_J) := P^*(c \mid d), \tag{3.2}$$

where

$$P^* = \arg\big(\sup\{\mathrm{ent}(P) : \text{all possible } P,\ P(a \mid b)_J = t_J\}\big), \tag{3.3}$$

with the entropy of any $P$ given as usual as (recalling $X^T = (x_1, \ldots, x_m)$, $x_{m+1} = 1 - \mathrm{sum}(X)$),

$$\mathrm{ent}(P) = \sum_{j=1}^{m+1} \big(-x_j \cdot \log(x_j)\big). \tag{3.4}$$

For a basic application of the maximum entropy approach as outlined in eqs.(3.2)-(3.4), where, typically, the method of Lagrange multipliers is employed to seek the maximum with respect to the constraints, see [Van Fraasen, 1981]. For a criticism of this approach, which in a sense produces a “nonintuitive” result as opposed to the result using a second order probability approach, see [Grove et al., 1997] and Goodman & Nguyen’s follow-up and generalization of the issue [Goodman & Nguyen, 1999].
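The maximum entropy evaluation can also be approximated numerically. The sketch below treats contraposition, premise $P(a \mid b) = t$ and conclusion $(b' \mid a')$, over the four atoms $ab$, $a'b$, $ab'$, $a'b'$, and maximizes the entropy by nested ternary search, which is adequate here because entropy is concave. The comparison value $1/(1 + ((1-t)/t)^t)$ is what the Lagrange conditions give under these assumptions; it is a check worked out for this example, not a statement of the authors' procedure.

```python
import math

def entropy(xs):
    """Shannon entropy, with 0 * log(0) taken as 0."""
    return -sum(x * math.log(x) for x in xs if x > 0)

def ternary_max(f, lo, hi, iters=100):
    """Maximizer of a concave function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

def maxent_contraposition(t):
    """Numerically maximize entropy subject to P(a|b) = t; return P*(b'|a')."""
    def atoms(r, x3):
        # r = P(b); atoms in the order ab, a'b, ab', a'b'
        return [t * r, (1 - t) * r, x3, 1 - r - x3]

    def best_x3(r):
        return ternary_max(lambda x3: entropy(atoms(r, x3)), 0.0, 1 - r)

    r = ternary_max(lambda r: entropy(atoms(r, best_x3(r))), 0.0, 1.0)
    x = atoms(r, best_x3(r))
    return x[3] / (x[3] + x[1])

def closed_form(t):
    """Value implied by the Lagrange conditions for this four-atom setup."""
    return 1 / (1 + ((1 - t) / t) ** t)

for t in (0.6, 0.9):
    print(t, maxent_contraposition(t), closed_form(t))
```

The two columns agree to several decimal places, and both approach unity as $t$ approaches 1.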

Thus, corresponding to eq.(2.2.13), we can state that $(c \mid d)$ is deduced from $(a \mid b)_J$ in the minconc sense $i$, if

$$\lim_{t_J \uparrow 1_J} \mathrm{minconc}_i((a \mid b)_J; (c \mid d))(t_J) = 1, \tag{3.5}$$

and we can state that $(c \mid d)$ is deduced from $(a \mid b)_J$ in the maxent sense $i$, if

$$\lim_{t_J \uparrow 1_J} \mathrm{maxent}_i((a \mid b)_J; (c \mid d))(t_J) = 1, \quad i = 1, 2. \tag{3.6}$$

In Adams’ original terminology, minconc (type 2) deduction corresponds to his high probability deduction. Adams has also introduced other criteria for valid deduction of $(c \mid d)$ from $(a \mid b)_J$. The one of relevance here involves the minconc function evaluated at $t_J = 1_J$ (in this case, type 1 = type 2), which Adams calls certainty probability deduction. More specifically, the criterion for $(c \mid d)$ being certain-probability deduced from $(a \mid b)_J$ is that

$$\mathrm{minconc}((a \mid b)_J; (c \mid d))(1_J) = 1. \tag{3.7}$$

In addition, a basic modification must be noted for Adams’ concepts, apropos of comments made earlier concerning extending the definition of conditional probabilities when the denominator is zero, i.e., when the antecedent is assigned zero probability. Adams, in effect, extends minconc to operate on conditional events in such situations by formally defining the corresponding “conditional probability” to be unity. More specifically, in the context of this paper in analyzing Adams’ work, we shall apply the term “strong” to the certainty probability deduction and the minconc types 1 and 2 deduction as already provided in eqs.(3.1), (3.5), etc., and “weak” when, in such definitions, either $P(b_j) = 0$ in the premise set is allowed in the formal form of $P(a_j \mid b_j) = 1$ or $P(d) = 0$ is allowed in the formal form of $P(c \mid d) = 1$. For simplicity, we will use here

$$(a \mid b)_J \le_{\mathrm{HPL}} (c \mid d), \qquad (a \mid b)_J \le_{\mathrm{CPL}} (c \mid d), \tag{3.7}$$

to indicate high probability (usually of the strong type) and certainty probability (usually of the weak type) deduction validity, respectively, of $(c \mid d)$ from $(a \mid b)_J$, where, when required to make these distinctions, the subscript letters S, W, respectively, will be prefixed to indicate strong or weak types.

For completeness, we present in Table 1 (a good part of which has already appeared in [Goodman, 1999]) a compilation of computations of the type 1 mean conclusion function and the type 2 minimum conclusion function for a variety of combinations of premise sets and potential conclusions, including many corresponding to well-known reasoning schemes of classical logic. Note in particular that not only transitivity (no. 13), but also contraposition (14), positive conjunction (15), and strengthening (16) all fail to be valid in the strong HP sense but, as stated earlier, are valid in both the averaged surety sense and the certainty-probability sense (in at least the weak sense). Partial documentation for the derivations may be found in [Bamber et al., 2000], employing various integration techniques. The general assumption throughout Table 1 is that the set of atoms is $\mathcal{A}(V_0)((a \mid b)_J; (c \mid d))$, the minimal set of atoms generated by the premise and conclusion antecedents and consequents (see eq.(2.2.6)), and that the random probability vector $X$ has a uniform prior over $S_m$.


| No. and name of deduction scheme | Given levels of premises ($P(a \mid b)_J \ge t_J$ for minconc$_2$; $P(a \mid b)_J = t_J$ for meanconc$_1$) | Potential conclusion $(c \mid d)$ | $\mathrm{minconc}_2((a \mid b)_J; (c \mid d))(t_J)$ | $\mathrm{meanconc}_1((a \mid b)_J; (c \mid d))(t_J)$ | Valid for CPL? | Valid for EPL? | Valid for HPL? |
|---|---|---|---|---|---|---|---|
| 1. Disjunction | $P(a \mid b) = s$, $P(a \mid c) = t$ | $(a \mid b \vee c)$ | $\ge \max(s+t-1, 0)$ | $\ge \max(s+t-1, 0)$ | YES | YES | YES |
| 2. Bayes | $P(a \mid b) = s$, $P(c \mid ab) = t$ | $(c \mid b)$ | $\ge st$ | $\ge st$ | YES | YES | YES |
| 3. Cautious Monotonicity | $P(a \mid b) = s$, $P(c \mid b) = t$ | $(a \mid bc)$ | $\ge \max(s+t-1, 0)$ | $\ge \max(s+t-1, 0)$ | YES | YES | YES |
| 4. PSCEA Order | $P(a \mid b) = t$, for $\emptyset < a < b$, $\emptyset < c < d$ | $(c \mid d)$ | $\ge t$ | $\ge t$ | YES | YES | YES |
| 5. Reflexivity | $P(a \mid b) = t$ | $(a \mid b)$ | $t$ | $t$ | YES | YES | YES |
| 6. Cut | $P(a \mid b) = s$, $P(c \mid ab) = t$ | $(ac \mid b)$ | $\ge st$ | $\ge st$ | YES | YES | YES |
| 7. Exceptions ($a'bc',\ bc \ne \emptyset$) | $P(a \mid bc) = s$, $P(a' \mid b) = t$ | $(c \mid b)$ | $\ge \max(s+t-1, 0)$ | $\ge \max(s+t-1, 0)$ | YES | YES | YES |
| 8. Equivalence | $P(a \mid b) = s$, $P(b \mid a) = t$ | $a \Leftrightarrow b$ | $\ge st$ | $(s+t)/[2(s+t-st)]$ | YES | YES | YES |
| 9. Strict Modus Ponens | $P(a \mid b) = s$, $P(b) = t$ | $ab$ | $st$ | $st$ | YES | YES | YES |
| 10. General Modus Ponens | $P(a \mid b \vee c) = s$, $P(b) = t$ | $ab$ | $\ge st$ | $st + (1-t)/2$ | YES | YES | YES |
| 11. Condition. Bounds 1 | $P(a \mid b) = t$ | $b \Rightarrow a$ | $\ge t$ | $(2+t)/3$ | YES | YES | YES |
| 12. Condition. Bounds 2 | $P(ab) = t$ | $(a \mid b)$ | $\ge t$ | $2\,[-t\log(t) - t(1-t)]/(1-t)^2$ | YES | YES | YES |
| 13. Transitivity-Syllogism | $P(a \mid b) = s$, $P(b \mid c) = t$ | $(a \mid c)$ | $0$ | $\ge st + (1-t)/2 - (1-s)s(2s-1)(1-t^2)/(1+2t)$ | YES | YES | NO |
| 14. Contraposition | $P(a \mid b) = t$ | $(b' \mid a')$ | $0$ | $(1/t)\,[1 + (1-t)\log(1-t)/t]$ | YES | YES | NO |
| 15. Positive Conjunction | $P(a \mid b) = t$, $P(a \mid c) = t$ | $(a \mid bc)$ | $0$ | $(1+t)/3 + [((1+t)(2-t)/(3t)) \cdot \theta(t)]$, with $\theta(t) = (t^2/4)[\log((2-t)/t)]/(1-t) - ((1-t)^2/4)\log((1+t)/(1-t))$ | YES | YES | NO |
| 16. Strengthen Antecedent | $P(a \mid b) = t$ | $(a \mid bc)$ | $0$ | approx. $t$ (complicated, but in closed form) | YES | YES | NO |
| 17. Penguin Triangle ($abc'd \ne \emptyset$) | $P(a \mid b) = r$, $P(b \mid c) = s$, $P(d \mid c) = t$, $P(a'b \mid d) = u$ | $(a'b \mid c)$ | $0$ | ? | YES (weakly) | YES | NO |
| 18. Modified Penguin Triangle | $P(a \mid b) = r$, $P(b \mid c) = s$, $P(d \mid c) = t$, $d \le a'b$ | $(a' \mid c)$ | $\ge \max(s+t-1, 0)$ | $\ge \max(s+t-1, 0)$ | YES (weakly) | YES | YES (weakly) |
| 19. Consequ. 1 | $P(a \mid b) = t$ | $a$ | $0$ | $(1+t)/3$ | NO | NO | NO |
| 20. Consequ. 2 | $P(a \mid b) = t$ | $b$ | $0$ | $1/3$ | NO | NO | NO |
| 21. Consequ. 3 | $P(a) = t$ | $(a \mid b)$ | $0$ | $(1/2)(1 + g(t))$, with $g(t) = [(1-t)\log(1-t)]/t - (t\log(t))/(1-t)$ | YES | YES | NO |
| 22. Consequ. 4 | $P(b) = t$ | $(a \mid b)$ | $0$ | $1/2$ | NO | NO | NO |
| 23. Nixon Diamond | $P(ab \mid c) = s$, $P(d \mid a) = t$, $P(d' \mid b) = t$ | $(d \mid c)$ | $0$ | $1/2$ | YES (weakly) | NO | NO |
| 24. Reverse Cond. Bnd. 1 | $P(b \Rightarrow a) = t$ | $(a \mid b)$ | $0$ | $2(1-t)\log(1-t)$ … $3$ … $(1-t)(2+t)$ … $t$ | YES | YES | NO |
| 25. Reverse Cond. Bnd. 2 | $P(a \mid b) = t$ | $ab$ | $0$ | $t/3$ | NO | NO | NO |
| 26. Abduction | $P(a \mid b) = s$, $P(a) = t$ | $b$ | $0$ | if $s > t$: $t/(2s)$; if $s < t$: $t$ … $s(1-t)$ … $2(t - 2st + s)$ … | NO | NO | NO |
| 27. Induction | For $b_j c$ all disjoint, $\vee(b_j c) < c$: $P(a \mid b_j \,\&\, c) = t_j$, $j = 1, \ldots, n$ | $(a \mid c)$ | $0$ | ? | NO | NO | NO |
| 28. Augmented Induction | For $b_j c$ all disjoint, $\vee(b_j c) < c$: $P(a \mid b_j \,\&\, c) = t_j$, $j = 1, \ldots, n$; $P(\vee(b_j) \mid c) = s$ | $(a \mid c)$ | $\ge$ …$(t_J) - (1-s)$ | $\ge$ …$(t_J) - (1-s)$ | YES | YES | YES |
| 29. Constrained Conjunction | $P(a) = s$, $P(b) = t$ | $ab$ | $\max(s+t-1, 0)$ | $(1/2)(\min(s,t) + \max(s+t-1, 0))$ | YES | YES | YES |
| 30. Constrained Disjunction | $P(a) = s$, $P(b) = t$ | $a \vee b$ | $\max(s,t)$ | $(1/2)(\max(s,t) + \min(s+t, 1))$ | YES | YES | YES |

Table 1. Tabulation of minconc and meanconc functions and listing of validity-nonvalidity of selected potential deduction schemes with respect to CPL, EPL, and HPL. The assumption here is a minimally-generated set of atoms from the relevant premise and conclusion antecedents and consequents, and a uniform prior over the m-simplex of resulting possible probability functions.
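Some of the simpler meanconc entries of Table 1 can be reproduced by simulation. The sketch below rests on an explicit assumption about how the conditioning in eq.(2.2.1) is carried out: the four atoms $ab$, $a'b$, $ab'$, $a'b'$ are used, and conditioning on $P(a \mid b) = t$ is modelled as the uniform distribution on the constraint plane of eq.(2.2.10), obtained by drawing $(P(b), P(ab'), P(a'b'))$ uniformly from the 2-simplex. Under that reading the estimates can be compared with entries 11, 19, and 20, and with the contraposition entry (no. 14) read as $(1/t)[1 + (1-t)\log(1-t)/t]$.

```python
import math
import random

def sample_on_plane(t, rng):
    """One probability vector (ab, a'b, ab', a'b') with P(a|b) = t exactly.

    Assumption: (r, x3, x4) is uniform on the 2-simplex, where r = P(b),
    and r is split as ab = t*r, a'b = (1-t)*r.
    """
    g = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(g)
    r, x3, x4 = (v / total for v in g)
    return t * r, (1 - t) * r, x3, x4

def meanconc_estimates(t, n=100000, seed=1):
    """Monte Carlo averages of four conclusion probabilities."""
    rng = random.Random(seed)
    sums = {"b=>a": 0.0, "a": 0.0, "b": 0.0, "b'|a'": 0.0}
    for _ in range(n):
        ab, nab, anb, nanb = sample_on_plane(t, rng)
        sums["b=>a"] += 1 - nab            # P(b => a) = 1 - P(a'b)
        sums["a"] += ab + anb
        sums["b"] += ab + nab
        sums["b'|a'"] += nanb / (nanb + nab)
    return {k: v / n for k, v in sums.items()}

t = 0.8
est = meanconc_estimates(t)
expected = {
    "b=>a": (2 + t) / 3,                                  # no. 11
    "a": (1 + t) / 3,                                     # no. 19
    "b": 1 / 3,                                           # no. 20
    "b'|a'": (1 / t) * (1 + (1 - t) * math.log(1 - t) / t),  # no. 14, as read here
}
for key in est:
    print(key, round(est[key], 4), round(expected[key], 4))
```

Agreement within Monte Carlo error supports this reading of the conditioning, but it is only a consistency check on a few entries, not a derivation of the table.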

We  also  illustrate  in  Table  2  (somewhat  overlapping  with  Table  1)  briefly  how  all  three  functions 
minconc,  meanconc,  and  maxent  can  be  similar  or  quite  divergent  for  various  potential  deduction 
schemes.  (Again,  see  [Bamber  et  al.,  2000]  for  documentation.) 


| Name of deduction scheme | Given levels of premises ($P(a \mid b)_J \ge t_J$ for minconc$_2$; $P(a \mid b)_J = t_J$ or $P(a \mid b)_J \ge t_J$ for meanconc) | Potential conclusion $(c \mid d)$ | $\mathrm{minconc}((a \mid b)_J; (c \mid d))(t_J)$ | $\mathrm{meanconc}((a \mid b)_J; (c \mid d))(t_J)$ | $\mathrm{maxent}((a \mid b)_J; (c \mid d))(t_J)$ |
|---|---|---|---|---|---|
| Transitivity-Syllogism | $P(a \mid b) = s$, $P(b \mid c) = t$ | $(a \mid c)$ | $0$ | $\ge st + (1-t)/2 - (1-s)s(2s-1)(1-t^2)/(1+2t)$ | $(1 + (2s-1)t)/2$ |
| Contraposition | $P(a \mid b) = t$ | $(b' \mid a')$ | $0$ | $(1/t)\,[1 + (1-t)\log(1-t)/t]$ | $1/(1 + ((1-t)/t)^t)$ |
| Disjunctive Syllogism | $P(a \vee b) = s$, $P(a') = t$ | $b$ | $\ge \max(s+t-1, 0)$ | $s - (1/2)(1-t)$ | $s - (1/2)(1-t)$ |
| Moving Term | $P(a \vee bc) = s$, $ab'c' > \emptyset$ | $ab \vee c$ | $0$ | $(4/5)s + (1/3)(1-s)$ | $(4/5)s + (1/3)(1-s)$ |
| Simple Lower Bound | $P(a) \ge s$ ($1 > s > 1/2$) | $a$ | $s$ | $(1+s)/2$ | $s$ |

Table 2. Comparison of minconc, meanconc, and maxent functions for five selected possible deduction schemes under the same assumptions as in Table 1.

In particular, while maxent and meanconc either coincide or are close to each other in the first four cases of Table 2, they differ considerably for the bottom deduction scheme. Note, in fact, their coincidence for both the invalid deduction scheme moving term (since the limit as $s$ approaches 1 is $4/5 < 1$) and the valid one, disjunctive syllogism.
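The bottom row of Table 2 is simple enough to recompute directly. The following sketch assumes only two atoms, $a$ and $a'$, so that the admissible probabilities under the premise $P(a) \ge s$ form the interval $[s, 1]$; the grid resolution and function names are choices made for this example. It also evaluates the moving-term expression at $s = 1$.

```python
import math

def binary_entropy(p):
    """Entropy of the two-atom distribution (p, 1 - p)."""
    return -sum(x * math.log(x) for x in (p, 1 - p) if x > 0)

def simple_lower_bound(s, grid=10001):
    """minconc, meanconc, maxent of P(a) under the premise P(a) >= s."""
    ps = [s + (1 - s) * k / (grid - 1) for k in range(grid)]  # admissible P(a)
    minconc = min(ps)
    meanconc = sum(ps) / len(ps)           # uniform second order prior on [s, 1]
    maxent = max(ps, key=binary_entropy)   # admissible value of highest entropy
    return minconc, meanconc, maxent

def moving_term_meanconc(s):
    """Closed form listed for the moving term scheme in Table 2."""
    return (4 / 5) * s + (1 / 3) * (1 - s)

mn, me, mx = simple_lower_bound(0.8)
print(mn, me, mx)
print(moving_term_meanconc(1.0))
```

For $s > 1/2$ the entropy is decreasing on $[s, 1]$, so maxent sits at the lower endpoint together with minconc, while meanconc sits at the midpoint.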

4.  Additional  Analysis  of  HPL  and  CPL:  Use  of  a  Conditional  Event  Algebra 

Certainly, as just seen through a number of examples in Tables 1 and 2, Adams’ two basic reasoning systems HPL and CPL are quite distinct from EPL. In fact, HPL and CPL are monotonic logics in the sense that if $(c \mid d)$ is deduced from $(a \mid b)_J$ in either the HPL or CPL sense, then for any other collection of conditionals (or unconditionals) $(a \mid b)_K$ (with probability space $(\Omega, \mathcal{B}, P)$ given, $P$ arbitrarily variable and all $a_j, b_j, c, d$ in $\mathcal{B}$, etc., as usual), $(c \mid d)$ will also be deduced in the same sense from $(a \mid b)_{J \cup K}$. On the other hand, EPL is a nonmonotonic logic in general, as we shall see later. (For further background on nonmonotonic logics, see, e.g., the text of [Schlechta, 1997].) Nevertheless, despite the differences, there are certain key connections between HPL, CPL and EPL that will be pointed out.

Adams  [1975,  1996]  has  ostensibly  shown  already  a  number  of  relations  among  not  only  HPL 
and  CPL,  but  other  logics.  This,  of  course,  does  not  include  EPL,  for  which  Bamber  in  [Bamber, 
2000]  has  shown  basic  connections.  However,  the  thrust  here  is  to  refine  these  results  more  to 
account  for  the  difference  between  the  weak  and  strong  versions  of  these  logics.  In  addition,  this 
section  will  show,  for  the  first  time,  how  a  certain  form  of  “conditional  event  algebra”  (to  be 
explained)  can  be  used  to  provide  a  complete  setting  for  both  elegant  formulations  and 
derivations  of  all  of  the  key  results. 

This  section  provides  only  the  barest  information  necessary  for  the  use  of  conditional  event 
algebra  in  deriving  properties  of  HPL  and  CPL.  Later,  this  will  also  be  useful  in  additional  study 
of  EPL.  For  a  much  more  detailed  presentation,  see  [Goodman  &  Nguyen,  1995],  where  also  a 
history  and  various  characterizations  are  presented  for  the  particular  conditional  event  algebra 
discussed  below. 

To begin with, notice that while the conditional probability $P(a \mid b)$ appears to be the natural measure of uncertainty or reliability corresponding to an inference rule “if b, then a” (taking into account the discussion in Section 1.2, precluding use of the possible natural alternative form $P(b \Rightarrow a)$), unlike the latter, no standard object or “conditional event” exists which can play the role of the formal argument $(a \mid b)$ of $P$ in the evaluation $P(a \mid b)$. Or so it seems. In fact, a further apparent barrier to the existence of such possible conditional events has been supposedly provided in [Lewis, 1976]. Roughly speaking, Lewis’ result states that given any nontrivial probability space $(\Omega, \mathcal{B}, P)$, with $P$ arbitrary, one cannot have, for any $\emptyset < a < b$ in $\mathcal{B}$, at the same time some event, say $(a \mid b)$, also in $\mathcal{B}$ with the property that

$$\text{for all } P \text{ over } \mathcal{B} \text{ (with } P(b) > 0\text{)}, \quad P((a \mid b)) = P(a \mid b). \tag{4.1}$$

But this interesting (and readily proven) result does not restrict $(a \mid b)$ from existing in a space $\mathcal{B}_o$ which is, in an algebraic sense, strictly larger than $\mathcal{B}$, via an isomorphic-isometric correspondence between any $a, b, c, \ldots$ in $\mathcal{B}$ and $(a \mid \Omega), (b \mid \Omega), (c \mid \Omega), \ldots$ in $\mathcal{B}_o$. That is, the coexistence of $(a \mid \Omega), (b \mid \Omega), (c \mid \Omega), \ldots$ and $(a \mid b), (c \mid d), \ldots$ all in $\mathcal{B}_o$ does not violate Lewis’ “triviality result”, even though the $(a \mid \Omega), (b \mid \Omega), (c \mid \Omega), \ldots$ are strongly identifiable (isomorphically-isometrically) with the corresponding $a, b, c, \ldots$ in $\mathcal{B}$. In fact, the construction of such conditional events may be carried out in the same completely routine manner that events in a product probability space are obtained. (For further proof of the avoidance of Lewis’ restriction, see [Goodman, Mahler & Nguyen, 1997], Sections 11.5 and 12.2.2.) The basic product probability space here, $(\Omega_o, \mathcal{B}_o, P_o)$, extending any given probability space $(\Omega, \mathcal{B}, P)$, is simply the one formed out of a countable infinity of independent factor spaces, each identical to $(\Omega, \mathcal{B}, P)$. In turn, given any $a, b$ in $\mathcal{B}$, $(a \mid b)$ is nothing more than the formal algebraic analogue of any of its nontrivial numerical evaluations


$$P(a \mid b) = P(ab)/(1 - P(b')) = \sum_{j=0}^{+\infty} (P(b'))^j \cdot P(ab), \tag{4.2}$$

where arithmetic sum corresponds to disjoint $\vee$, multiplication to cartesian product $\times$, and subtraction to negation $(\cdot)'$. That is, one can easily show that if we define the disjoint disjunction

$$(a \mid b) := \bigvee_{j=0}^{+\infty} \big[(b')^j \times (ab \mid \Omega)\big], \qquad \Omega_o := \Omega \times \Omega \times \Omega \times \cdots, \tag{4.3}$$

where $(\cdot)^j \times (\cdot\cdot)$ indicates the ordinary $j$-fold cartesian product of $(\cdot)$ with itself, followed by the cartesian product of this result with $(\cdot\cdot)$, then, from now on, where necessary, indicating all basic order relations and boolean operators over $\mathcal{B}_o$ corresponding to the usual ones over $\mathcal{B}$ by the addition of subscript $o$,

$$(a \mid \Omega) =_o a \times \Omega_o, \tag{4.4}$$

and the basic consistency relation holds

$$P_o((a \mid b)) = P(a \mid b), \quad \text{for all } P \text{ over } \mathcal{B} \text{ with } P(b) > 0. \tag{4.5}$$
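The product space construction has a direct operational reading: generate independent outcomes, skip those falling in $b'$, and at the first outcome in $b$ report whether it is also in $a$. The sketch below simulates this for a hypothetical fair die (the outcome space and the events $a$, $b$ are illustrative assumptions) and also evaluates a partial sum of the geometric series in eq.(4.2).

```python
import random

def conditional_event_occurs(draw, in_a, in_b, rng):
    """Evaluate (a|b) on one infinite sequence, generated lazily.

    Following eq. (4.3): skip outcomes in b' until the first outcome in b,
    then report whether that outcome is also in a.
    """
    while True:
        w = draw(rng)
        if in_b(w):
            return in_a(w)

def estimate(n=100000, seed=2):
    """Relative frequency of (a|b) over n independent sequences."""
    rng = random.Random(seed)
    draw = lambda r: r.randrange(6)      # fair die with outcomes 0..5
    in_a = lambda w: w in (0, 1)         # a = {0, 1}
    in_b = lambda w: w in (0, 1, 2)      # b = {0, 1, 2}, so P(a|b) = 2/3
    hits = sum(conditional_event_occurs(draw, in_a, in_b, rng) for _ in range(n))
    return hits / n

def geometric_sum(p_ab, p_not_b, terms=200):
    """Partial sum of the series in eq. (4.2)."""
    return sum(p_not_b ** j * p_ab for j in range(terms))

print(estimate(), geometric_sum(1 / 3, 1 / 2))
```

Both numbers should be close to $P(a \mid b) = 2/3$, which is the content of the consistency relation in eq.(4.5) for this example.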

Even more importantly, the well defined existence of such conditional events $(a \mid b), (c \mid d), \ldots$ in $\mathcal{B}_o$, for any $a, b, c, d, \ldots$ in $\mathcal{B}$, allows for natural extensions of conjunction, disjunction, negation, and, in fact, any boolean function, relative to ordinary events in $\mathcal{B}$, to conditional expressions, and in turn allows for probability evaluations of such expressions. First, the basic recursive form that all conditional events here must satisfy is

$$(a \mid b) =_o (ab \mid \Omega) \vee (b' \times (a \mid b)) \quad \text{(disjoint)}. \tag{4.6}$$

Introduce the conditional-like operator $[\cdot \mid \cdot\cdot]: \mathcal{B}_o \times \mathcal{B} \to \mathcal{B}_o$, where for any $A$ in $\mathcal{B}_o$ and $b$ in $\mathcal{B}$, observing also a recursive form (with disjoint disjunctions),

$$[A \mid b] := \bigvee_{j=0}^{+\infty} \big((b')^j \times (A \,\&_o\, (b \mid \Omega))\big) =_o (A \,\&_o\, (b \mid \Omega)) \vee (b' \times [A \mid b]) \ \text{ in } \mathcal{B}_o, \tag{4.7}$$

where it is noted that for any $a, b$ in $\mathcal{B}$ and $A$ in $\mathcal{B}_o$,

$$[A \mid \Omega] =_o A, \qquad [(a \mid \Omega) \mid b] =_o (a \mid b). \tag{4.8}$$

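The corresponding probability evaluations to eqs.(4.6)-(4.8) are simply, for any $P$ with $P(b) > 0$,

$$\begin{aligned} P(a \mid b) &= P(ab) + P(b')P(a \mid b), \\ P_o([A \mid b]) &= P_o(A \mid (b \mid \Omega)) = P_o(A \,\&_o\, (b \mid \Omega))/P(b) = P_o(A \,\&_o\, (b \mid \Omega)) + P(b')P_o([A \mid b]), \end{aligned} \tag{4.9}$$

etc. In turn, utilizing the above recursive forms, it follows, for any $a, b, c, d$ in $\mathcal{B}$,

$$(a \mid b) =_o (ab \mid b), \qquad (a \mid b)' =_o (a' \mid b) =_o (a'b \mid b), \quad \text{provided that } b \ne \emptyset, \tag{4.10}$$

$$(a \mid b) \,\&_o\, (c \mid d) =_o [A \mid b \vee d], \qquad A := (abcd \mid \Omega) \vee (abd' \times (c \mid d)) \vee (cdb' \times (a \mid b)), \tag{4.11}$$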

whence if $P(b \vee d) > 0$,

$$P_o((a \mid b) \,\&_o\, (c \mid d)) = P_o(A)/P(b \vee d), \qquad P_o(A) = P(abcd) + P(abd')P(c \mid d) + P(cdb')P(a \mid b). \tag{4.12}$$

Note the special cases of modus ponens and, generalizing the isomorphic imbedding relation from $\mathcal{B}$ into $\mathcal{B}_o$ given via $a \to (a \mid \Omega)$ for any $a$ in $\mathcal{B}$, of a common antecedent $b$ in $\mathcal{B}$ (replacing $\Omega$), $b \ne \emptyset$,

$$(ac \mid b) \,\&_o\, (bd \mid \Omega) =_o (abcd \mid \Omega); \qquad (a \mid b) \,\&_o\, (c \mid b) =_o (ac \mid b), \quad (a \mid b) \vee_o (c \mid b) =_o (a \vee c \mid b). \tag{4.12'}$$
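Eq.(4.12) can be exercised numerically. The sketch below evaluates the conjunction formula exactly on a hypothetical 16-atom space over $(a, b, c, d)$ (the weights are arbitrary illustrative values), checks the common-antecedent special case of eq.(4.12'), and compares the general case with a direct simulation of the product space, in which each conditional event is decided at the first outcome falling in its antecedent.

```python
import random
from fractions import Fraction as F
from itertools import product

# Hypothetical probability space: 16 atoms over (a, b, c, d) with weights
# (k + 1)/136, chosen only for illustration (1 + 2 + ... + 16 = 136).
atoms = list(product([False, True], repeat=4))
weights = [F(k + 1, 136) for k in range(16)]

def P(event):
    """Probability of the atoms on which event(a, b, c, d) is true."""
    return sum(w for atom, w in zip(atoms, weights) if event(*atom))

def conj(a, b, c, d):
    """P_o((a|b) & (c|d)) via eq. (4.12); a, b, c, d are predicates on atoms."""
    p_abcd = P(lambda *w: a(*w) and b(*w) and c(*w) and d(*w))
    p_abd_ = P(lambda *w: a(*w) and b(*w) and not d(*w))
    p_cdb_ = P(lambda *w: c(*w) and d(*w) and not b(*w))
    p_c_d = P(lambda *w: c(*w) and d(*w)) / P(d)
    p_a_b = P(lambda *w: a(*w) and b(*w)) / P(b)
    p_bvd = P(lambda *w: b(*w) or d(*w))
    return (p_abcd + p_abd_ * p_c_d + p_cdb_ * p_a_b) / p_bvd

A = lambda a, b, c, d: a
B = lambda a, b, c, d: b
C = lambda a, b, c, d: c
D = lambda a, b, c, d: d

def simulate(n, seed=3):
    """Frequency with which (a|b) and (c|d) both occur on one sequence."""
    rng = random.Random(seed)
    float_weights = [float(w) for w in weights]
    hits = 0
    for _ in range(n):
        first = second = None            # truth values of (a|b) and (c|d)
        while first is None or second is None:
            a, b, c, d = rng.choices(atoms, float_weights)[0]
            if first is None and b:
                first = a
            if second is None and d:
                second = c
        hits += first and second
    return hits / n

print(float(conj(A, B, C, D)), simulate(50000))
```

With a common antecedent the formula collapses to $P(abc)/P(b)$, in line with eq.(4.12'); in the general case the simulated frequency should match the formula up to sampling error.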

Since $(\Omega_o, \mathcal{B}_o, P_o)$ is a legitimate probability space where all of the standard laws of boolean (and sigma) algebra, as well as of probability, are satisfied, then, e.g., the standard modular expansion of probability holds:

$$P_o((a \mid b) \vee_o (c \mid d)) = P_o((a \mid b)) + P_o((c \mid d)) - P_o((a \mid b) \,\&_o\, (c \mid d)) = P(a \mid b) + P(c \mid d) - P_o(A)/P(b \vee d), \tag{4.13}$$

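where $A$ is as in eq.(4.11), etc. More generally, the following important relation is to be noted, also readily derived from the above recursive structure of conditional events: For any finite index set $J$, using multivariable notation, for any collection of $a_j, b_j$ in $\mathcal{B}$, $j$ in $J$,

$$\&_o (a \mid b)_J := \underset{j \in J}{\&_o}\,(a_j \mid b_j) =_o [\,\theta_{\&}(a,b;J) \mid \vee(b_J)\,]; \tag{4.14}$$

$$\vee(b_J) := \bigvee_{j \in J} b_j, \qquad \theta_{\&}(a,b;J) := \bigvee_{\emptyset \ne K \subseteq J} \big(\gamma(a,b;K,J) \times \&_o (a \mid b)_{J \setminus K}\big) \quad \text{(disjoint disjunction)}; \tag{4.15}$$

$$\gamma(a,b;K,J) := \underset{j \in K}{\&}(a_j b_j) \ \&\ \underset{i \in J \setminus K}{\&}(b_i'); \qquad \gamma(a,b;J,J) = \underset{j \in J}{\&}(a_j b_j); \qquad \gamma(a,b;\emptyset,J) = \underset{j \in J}{\&}(b_j') := \&(b')_J; \tag{4.16}$$

for any index sets $\emptyset \subseteq K \subseteq J$. In turn, the probability evaluation of the conjunction in eq.(4.14) is

$$P_o(\&_o (a \mid b)_J) = P_o(\theta_{\&}(a,b;J)) / P(\vee(b_J)); \qquad P_o(\theta_{\&}(a,b;J)) = \sum_{\emptyset \ne K \subseteq J} P(\gamma(a,b;K,J)) \cdot P_o(\&_o (a \mid b)_{J \setminus K}), \tag{4.16'}$$

noting that, as in its algebraic counterpart in eq.(4.15), the full evaluation of $P_o(\&_o (a \mid b)_J)$ proceeds recursively, where the factors $P_o(\&_o (a \mid b)_{J \setminus K})$ are evaluated similarly with $J$ replaced by $J \setminus K$, until no more than single conditional probability terms appear. Note also the tie-in of the $\gamma$’s above with the related disjoint expansion of the conjunction of the material conditional

$$\&(b \Rightarrow a)_J := \underset{j \in J}{\&}(b_j \Rightarrow a_j) = \bigvee_{\emptyset \subseteq K \subseteq J} \gamma(a,b;K,J) = \&(b \Rightarrow a)_J \,\&\, (\vee(b)_J) \ \vee\ \&(b')_J \quad \text{(disjoint)}, \tag{4.17}$$

where

$$\&(b \Rightarrow a)_J \,\&\, (\vee(b)_J) = \&(b \Rightarrow a)_J \,\&\, (\vee(a)_J) = \bigvee_{\emptyset \ne K \subseteq J} \gamma(a,b;K,J) \quad \text{(disjoint)}. \tag{4.18}$$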

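Next, it follows that for any proper (or nontrivial) conditional events $(a|b)$, $(c|d)$, i.e., $\emptyset < a < b$ in $\mathcal{B}$, $\emptyset < c < d$ in $\mathcal{B}$ (again, see [Goodman & Nguyen, 1995]),

\[ \begin{aligned}
(a|b) \le_o (c|d) \ &\text{iff}\ (a|b) =_o (a|b)\,\&_o\,(c|d) \ \text{iff}\ (c|d) =_o (a|b) \vee_o (c|d) \\
&\text{iff}\ (a|b)\,\&_o\,(c|d)'_o =_o \emptyset_o \ \text{iff}\ (a|b)\,\&_o\,(c'|d) =_o \emptyset_o \\
&\text{iff}\ (a \le c \ \text{and}\ b \Rightarrow a \le d \Rightarrow c) \ \text{iff}\ (a \le c \ \text{and}\ c'd \le a'b) \\
&\text{iff}\ (b \Rightarrow a \le (b \vee d) \Rightarrow cd) \ \text{iff}\ (c'd \vee d'b \le a'b) \\
&\text{iff for all } P \text{ over } \mathcal{B}, \text{ with } P(b), P(d) > 0,\ P(a|b) \le P(c|d),
\end{aligned} \tag{4.19} \]

\[ \begin{aligned}
(a|b) =_o (c|d) \ &\text{iff}\ \big((a|b) \le_o (c|d) \ \text{and}\ (c|d) \le_o (a|b)\big) \ \text{iff}\ (a = c \ \text{and}\ a'b = c'd) \\
&\text{iff}\ (a = c \ \text{and}\ b = d) \\
&\text{iff for all } P \text{ over } \mathcal{B}, \text{ with } P(b), P(d) > 0,\ P(a|b) = P(c|d).
\end{aligned} \tag{4.20} \]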

In a related direction (see also the analogue in eq.(1.2.6)), any conditional event $(a|b)$ in $\mathcal{B}_o$ satisfies

\[ (ab|\Omega) \le_o (a|b) \le_o (b \Rightarrow a\,|\,\Omega) \ \text{and, for } P(b) > 0,\ P(ab) \le P(a|b) \le P(b \Rightarrow a). \tag{4.20'} \]

Note also that eq.(4.3) shows

\[ (a|\emptyset) =_o \emptyset_o; \quad \text{if } b > \emptyset,\ (b|b) =_o \Omega_o \ \text{(essentially)}, \tag{4.21} \]

the only possible improper (or trivial) conditional events, and for any $P$ over $\mathcal{B}$,

\[ P(b) = 0 \ \text{implies}\ P_o((a|b)) = 0, \quad P(b) > 0 \ \text{implies}\ P_o((b|b)) = 1, \tag{4.22} \]

etc. (the opposite of Adams’ interpretation, recalling the discussion in Section 3 and earlier).

From now on, we will use the term product space conditional event algebra (PSCEA) to refer to the product probability space $(\Omega_o,\mathcal{B}_o,P_o)$ extending $(\Omega,\mathcal{B},P)$ in the above isomorphic-isometric sense, with any $a$ in $\mathcal{B}$ corresponding isomorphically to $(a|\Omega)$ in $\mathcal{B}_o$ and, for any $P$, $P_o((a|\Omega)) = P(a)$, together with the conditional event-forming structure $a, b \to (a|b)$, etc.

We will also need to imbed an important operator developed independently in [Adams, 1975, 1996] and in [Calabrese, 1987, 1994], often called “quasi-conjunction” since, important as this operator will later be seen to be, it is not only non-boolean in structure, but also fails to form a full lattice operation with its DeMorgan dual [Goodman & Nguyen, 1995]. On the other hand, this operator appears to produce a sort of conjunction that at times may be the appropriate interpretation of “and” in a conditional setting. (See again [Goodman & Nguyen, 1995] for further details.) In any case, the appropriate definition of this non-boolean operator in the PSCEA setting, for a given probability space $(\Omega,\mathcal{B},P)$ and any events $\emptyset < a_j < b_j$ in $\mathcal{B}$, $j$ in $J$ (any finite index set), producing proper conditional events $(a_j|b_j)$ in $\mathcal{B}_o$, $j$ in $J$, is simply the direct (non-associative) one, using the multivariable notation from eqs.(4.14), (4.16),

\[ \&_{AC}(a|b)_J =_d \&_{AC,\,j \in J}(a_j|b_j) =_d \big(\&(b \Rightarrow a)_J \,\big|\, \vee(b_J)\big) = \big(\&(b \Rightarrow a)_J\ \&\ \vee(b_J) \,\big|\, \vee(b_J)\big) \ \text{in } \mathcal{B}_o. \tag{4.23} \]

Note that this version of $\&_{AC}$ in PSCEA is always well defined, since its domain consists of proper conditional events. Thus the identification property in eq.(4.20) can be used to test equality of other proper conditional events with those produced by $\&_{AC}$, under the condition that the latter produces a proper conditional event. Indeed, the only improper conditional event that $\&_{AC}(a|b)_J$ can assume in general is $\emptyset_o$, since

\[ (\&_{AC}(a|b)_J)'_o = \big(\vee(a'b)_J \,\big|\, \vee(b_J)\big), \quad \vee(a'b)_J =_d \bigvee_{j \in J}(a_j' b_j), \tag{4.24} \]

and since $\vee(b_J) > \emptyset$ is always assumed,

\[ \&_{AC}(a|b)_J = \Omega_o \ \text{iff}\ \big(\vee(a'b)_J \,\big|\, \vee(b_J)\big) = \emptyset_o, \]

which is impossible, since by the proper conditional event assumption each $a_j' b_j > \emptyset$, hence $\vee(a'b)_J > \emptyset$. For the case of $J = \{1\}$, $\&_{AC}(\cdot)_J$ reduces to the usual identity operator, in common with $\&_o(\cdot)_J$ and $\vee_o(\cdot)_J$:

\[ \&_{AC}(a|b)_J = \&_{AC}(a_1|b_1) = (a_1|b_1). \tag{4.25} \]

Clearly, when comparing the forms in eqs.(4.14), (4.15), (4.18), and (4.23), $\&_{AC}$ and $\&_o$ differ in that each corresponding term of $\&_o$ also has a cartesian product factor. Hence, it readily follows that it is always true that

\[ \&_o(a|b)_J \le \&_{AC}(a|b)_J. \tag{4.26} \]

Note that the probability evaluation of $\&_{AC}(a|b)_J$, analogous to the expansion in eq.(4.16'), is

\[ P_o(\&_{AC}(a|b)_J) = P(A(a,b;J)) / P(\vee(b_J)); \quad P_o(A(a,b;J)) = \sum_{\emptyset \neq K \subseteq J} P(\gamma(a,b;K,J)), \tag{4.27} \]

which, in general, can be considerably larger than the counterpart $P_o(\&_o(a|b)_J)$.
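The recursion of eq.(4.16') and the sum of eq.(4.27) can be tried out on a small finite sample space. The Python sketch below is illustrative only: the function names, the uniform six-atom space, and the base case (an empty conjunction is given probability 1) are assumptions made here, and every $P(b_j)$ is assumed positive.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, P):
    """Probability of an event (a set of atoms) under atom weights P."""
    return sum((P[w] for w in event), Fraction(0))

def nonempty_subsets(J):
    """All nonempty subsets K of the index tuple J."""
    for r in range(1, len(J) + 1):
        yield from combinations(J, r)

def gamma(a, b, K, J, omega):
    """gamma(a,b;K,J) of eq.(4.16): a_j b_j for j in K, b_j' for j in J minus K."""
    g = set(omega)
    for j in J:
        g &= (a[j] & b[j]) if j in K else (omega - b[j])
    return g

def p_union_b(b, J, P):
    """P of the disjunction of the antecedents b_j, j in J."""
    return prob(set().union(*(b[j] for j in J)), P)

def p_pscea_conj(a, b, J, P, omega):
    """P_o of the PSCEA conjunction via the recursion of eq.(4.16')."""
    if not J:
        return Fraction(1)  # assumed base case for the empty conjunction
    total = Fraction(0)
    for K in nonempty_subsets(J):
        pg = prob(gamma(a, b, K, J, omega), P)
        if pg:  # skip null terms, so the recursion is never entered needlessly
            rest = tuple(j for j in J if j not in K)
            total += pg * p_pscea_conj(a, b, rest, P, omega)
    return total / p_union_b(b, J, P)

def p_ac_conj(a, b, J, P, omega):
    """P_o of the AC (quasi-)conjunction via eq.(4.27)."""
    total = sum((prob(gamma(a, b, K, J, omega), P)
                 for K in nonempty_subsets(J)), Fraction(0))
    return total / p_union_b(b, J, P)

# Hypothetical example: uniform measure on six atoms, two proper conditional events.
omega = set(range(6))
P = {w: Fraction(1, 6) for w in omega}
a = {1: {0}, 2: {0, 2}}
b = {1: {0, 1}, 2: {0, 2, 3}}
J = (1, 2)

print(p_pscea_conj(a, b, J, P, omega))  # 3/8
print(p_ac_conj(a, b, J, P, omega))     # 1/2
```

In this example $P(a_1|b_1) = 1/2$ and $P(a_2|b_2) = 2/3$, so the PSCEA value 3/8 lies between the FHH-type lower bound 1/6 and the AC value 1/2, in line with eq.(4.26).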

In turn, we next state some important connections between ordering with respect to PSCEA conjunction and AC conjunction. From now on, for simplicity, we omit subscript $o$ from ordering and equality relations and negation in PSCEA, but retain it for conjunction (and disjunction), in order to distinguish it from $\&_{AC}$, which will be widely used.

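Theorem 4.1. Ordering relations between $\&_o$ and $\&_{AC}$

For $(\Omega,\mathcal{B},P)$ a given probability space with PSCEA extension $(\Omega_o,\mathcal{B}_o,P_o)$, $P$ arbitrary, and any proper conditional events $(a_j|b_j)$ in $\mathcal{B}_o$, $j$ in $J$ (finite), $(c|d)$ in $\mathcal{B}_o$,

(i) $\&_o(a|b)_J \le (c|d)$ iff $\text{Or}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \le (c|d)\big)$.

(ii) $\text{not}(\&_o(a|b)_J \le (c|d))$ iff $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\text{not}(\&_{AC}(a|b)_K \le (c|d))\big)$

iff $\&_o(a|b)_J\ \&_o\ (c'|d) \neq \emptyset_o$

iff $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \neq \emptyset_o\big)$ and $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}((a|b)_K, (c'|d)) \neq \emptyset_o\big)$.

(iii) $\&_o(a|b)_J = \emptyset_o$ iff $\text{Or}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K = \emptyset_o\big)$.

(iv) $\&_o(a|b)_J \neq \emptyset_o$ iff $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \neq \emptyset_o\big)$.

Proof: (i) is the same as Lemma 10 in [Bamber et al., 2000].

(iii) is straightforward via cases of card(J) = 1, 2, 3. One can proceed analogously for higher values of card(J).

(ii) and (iv) follow logically from (i) and (iii). ■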

5.  Additional  Analysis  of  HPL  and  CPL:  Use  of  Algebraic  Characterizations  of 
Particular  Probability  Relations 

While in Section 4 a number of algebraic properties were developed directly related to PSCEA, in this part relations are exhibited, for the most part, between algebraic descriptions and probability bounds. This is motivated by the following definitions, which refine and carefully delineate between the weak and strong versions of HPL and CPL. More specifically, let, as usual, $(\Omega,\mathcal{B},P)$ be a given probability space with probability measure $P: \mathcal{B} \to [0,1]$ arbitrarily variable, $(\Omega_o,\mathcal{B}_o,P_o)$ its PSCEA extension, $J$ a finite index set, and events $\emptyset < a_j < b_j$ in $\mathcal{B}$, $j$ in $J$, $\emptyset < c < d$ in $\mathcal{B}$. We use the multivariable notation as previously developed whenever possible, so that in the following, e.g., $(a|b)_J$ represents the premise set of conditional events $(a_j|b_j)$ (or unconditional/ordinary events whenever $b_j = \Omega$) and $(c|d)$ represents the single (for purpose of simplicity) potential conclusion conditional event (or unconditional event, when $d = \Omega$).

By convention, let us call the collection of above assumptions Basic Assumption I.

For general background, we again refer to [Adams, 1966, 1975, 1986, 1996; Goodman & Nguyen, 1998; Goodman, 1999; and Bamber et al., 2000].

Definition 1. Say that strong high probability deduction (or logic) (SHPL) holds with respect to $((a|b)_J;(c|d))$ (or that $(a|b)_J$ deduces $(c|d)$ in the SHPL sense, etc.), written symbolically as

\[ (a|b)_J \le_{SHPL} (c|d), \tag{5.1} \]

iff

\[ \lim_{t_J \uparrow 1_J} \text{minconc}((a|b)_J;(c|d))(t_J) = 1, \tag{5.2} \]

i.e.,

\[ (\text{for any } 0 < \varepsilon < 1)(\text{there is a } 0 < \delta_\varepsilon < 1)(\text{for any } P)\big(\text{if } [P(a|b)_J \ge 1-\delta_\varepsilon], \text{ then } [P(c|d) \ge 1-\varepsilon]\big), \tag{5.3} \]

noting all conditional probabilities are in the ordinary sense, i.e., $P(b)_J > 0_J$.

Definition 2. Say that strong high probability (SHPL) consistency holds with respect to $(a|b)_J$ iff the “if-part” of eq.(5.3) is nonvacuously satisfied for all possible threshold levels, i.e.,

\[ (\text{for any } 0 < \delta < 1)(\text{there is a } P_\delta)(P_\delta(a|b)_J \ge 1-\delta). \tag{5.4} \]

Definition 3. Say that weak high probability deduction (or logic) (WHPL) holds with respect to $((a|b)_J;(c|d))$ (or that $(a|b)_J$ deduces $(c|d)$ in the WHPL sense, etc.), written symbolically as

\[ (a|b)_J \le_{WHPL} (c|d), \tag{5.5} \]

iff in the expressions in eqs.(5.2) or (5.3) we allow possibly some of the $P(b_j)$ to be 0 and formally interpret $P(a_j|b_j) = 1$, and similarly for $P(c|d)$, i.e.,

\[ \begin{aligned}
&(\text{for any } 0 < \varepsilon < 1)(\text{there is a } 0 < \delta_\varepsilon < 1)(\text{for any } P) \\
&\big(\text{if } [(\text{for each } j \text{ in } J)(\text{either } P(a_j|b_j) \ge 1-\delta_\varepsilon \text{ or } P(b_j) = 0)], \\
&\ \ \text{then } [\text{either } P(c|d) \ge 1-\varepsilon \text{ or } P(d) = 0]\big).
\end{aligned} \tag{5.6} \]

Definition 4. Say that weak high probability (WHPL) consistency holds with respect to $(a|b)_J$ iff the “if-part” of eq.(5.6) is nonvacuously satisfied for all possible threshold levels, i.e.,

\[ (\text{for any } 0 < \delta < 1)(\text{there is a } P_\delta)(\text{for each } j \text{ in } J)(\text{either } P_\delta(a_j|b_j) \ge 1-\delta \text{ or } P_\delta(b_j) = 0). \tag{5.7} \]

Definition 5. Say that strong certainty probability (SCPL) deduction (or logic) holds with respect to $((a|b)_J;(c|d))$ (or that $(a|b)_J$ deduces $(c|d)$ in the SCPL sense, etc.), written symbolically as

\[ (a|b)_J \le_{SCPL} (c|d), \tag{5.8} \]

iff

\[ \text{minconc}((a|b)_J;(c|d))(1_J) = 1, \tag{5.9} \]

i.e.,

\[ (\text{for any } P)\big(\text{if } [P(a|b)_J = 1_J], \text{ then } [P(c|d) = 1]\big). \tag{5.10} \]

Definition 6. Say that strong certainty probability (SCPL) consistency holds with respect to $(a|b)_J$ iff the “if-part” of eq.(5.10) is nonvacuously satisfied at threshold level 1, i.e.,

\[ (\text{there exists } P)(P(a|b)_J = 1_J). \tag{5.11} \]

Definition 7. Say that weak certainty probability (WCPL) deduction (or logic) holds with respect to $((a|b)_J;(c|d))$ (or that $(a|b)_J$ deduces $(c|d)$ in the WCPL sense, etc.), written symbolically as

\[ (a|b)_J \le_{WCPL} (c|d), \tag{5.12} \]

iff in the expressions in eqs.(5.9) or (5.10) we allow possibly some of the $P(b_j)$ to be 0 and formally interpret $P(a_j|b_j) = 1$, and similarly for $P(c|d)$, i.e.,

\[ (\text{for any } P)\big(\text{if } [(\text{for each } j \text{ in } J)(\text{either } P(a_j|b_j) = 1 \text{ or } P(b_j) = 0)], \text{ then } [\text{either } P(c|d) = 1 \text{ or } P(d) = 0]\big). \tag{5.13} \]

Definition 8. Say that weak certainty probability (WCPL) consistency holds with respect to $(a|b)_J$ iff the “if-part” of eq.(5.13) is nonvacuously satisfied at threshold level 1, i.e.,

\[ (\text{there exists } P)(\text{for each } j \text{ in } J)(\text{either } P(a_j|b_j) = 1 \text{ or } P(b_j) = 0). \tag{5.14} \]

Remarks.  By  the  very  definitions  above,  it  follows  immediately  that: 

(i) SHPL consistency implies WHPL consistency.

(ii) SCPL consistency implies WCPL consistency.

(iii)  The  remaining  relations  among  the  various  concepts  will  be  examined  below. 

The  next  results  allow  us  to  determine  for  the  most  part  algebraically  when  any  of  the  above 
concepts  hold  true. 

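Theorem 5.1. Characterization of SHPL consistency. (Extension of [Adams, 1975])

Under Basic Assumption I, the following statements are equivalent:

(i) SHPL consistency holds with respect to $(a|b)_J$.

(ii) $\&_o(a|b)_J \neq \emptyset_o$.

(iii) $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \neq \emptyset_o\big)$.

(iv) There is a positive integer $M$ and an exhaustive nonvacuous partitioning $\{K_1,\ldots,K_M\}$ of $J$ such that, using the notation of eq.(4.16), letting

\[ K(j) =_d K_1 \cup \cdots \cup K_j, \ j = 1,\ldots,M; \quad K(0) =_d \emptyset, \tag{5.15} \]

\[ \gamma(a,b;K_j,J \setminus K(j)) \neq \emptyset, \ \text{for } j = 1,\ldots,M. \tag{5.16} \]

Note that necessarily for the $M$th term, $J \setminus K(M) = \emptyset$ and

\[ \gamma(a,b;K_M,\emptyset) = \&(a_{K_M}) \neq \emptyset. \tag{5.17} \]

(v) There is a positive integer $M$ and an exhaustive nonvacuous partitioning $\{K_1,\ldots,K_M\}$ of $J$ such that, using the notation of eq.(5.15),

\[ \&_{AC}(a|b)_{J \setminus K(j)} \neq \emptyset_o, \ \text{for } j = 0, 1, \ldots, M-1. \]

(vi) $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\text{Or}_{(\emptyset \neq L \subseteq K)}(\gamma(a,b;L,K) \neq \emptyset)\big)$.

(vii) $\text{And}_{(\emptyset \neq K \subseteq J)}\big(\&(b \Rightarrow a)_K\ \&\ \vee(a)_K \neq \emptyset\big)$.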


Proof: (iii) iff (vi) iff (vii): This follows immediately from the definition of $\&_{AC}$, etc.

(iv) iff (v) holds due to the basic structure of $\&_{AC}$ (see Section 4).

(ii) iff (iii) follows directly from Theorem 4.1(iv).

(ii) implies (iv): Using eqs.(4.14)-(4.16), (ii) shows that for some $\emptyset \neq K_1 \subseteq J$, there is a term $\gamma(a,b;K_1,J) \times \&_o(a|b)_{J \setminus K_1} \neq \emptyset_o$ in the consequent $\theta_{\&}(a,b;J)$ of $\&_o(a|b)_J$. Thus, both sides of the cartesian product must be non-null, whence $\&_o(a|b)_{J \setminus K_1} \neq \emptyset_o$. Next, replacing $J$ in the above reasoning by $J \setminus K_1$, we obtain some $\emptyset \neq K_2 \subseteq J \setminus K_1$ and a term $\gamma(a,b;K_2,J \setminus K_1 \setminus K_2) \times \&_o(a|b)_{J \setminus K_1 \setminus K_2} \neq \emptyset_o$. We continue the process until some $M$ is found so that (guaranteed) $J \setminus K_1 \cdots \setminus K_M = \emptyset$.

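(iv) implies (i): This follows the guidelines of Adams’ approach [1996]. Since it is readily verified that the $\gamma(a,b;K_j,J \setminus K(j)) \neq \emptyset$, for $j = 1,\ldots,M$, are all mutually disjoint subsets of $\vee(b)_J$, for any given real $\delta$, $0 < \delta < 1$, we construct a $P_\delta$ by assigning

\[ P_\delta(\gamma(a,b;K_j,J \setminus K(j))) =_d \delta^{j-1} - \delta^{j}, \ \text{for } j = 1,\ldots,M-1; \quad P_\delta(\gamma(a,b;K_M,\emptyset)) =_d \delta^{M-1}, \ \text{for } j = M. \tag{5.18} \]

Then, since for any $K_j$ and all $i$ in $K_j$, $b_i$ is disjoint from $\&(b')_{J \setminus K(j-1)}$, the latter being $\ge \bigvee_{k=1}^{j-1} \gamma(a,b;K_k,J \setminus K(k))$, it follows that

\[ P_\delta(b_i) \le 1 - P_\delta\Big(\bigvee_{k=1}^{j-1} \gamma(a,b;K_k,J \setminus K(k))\Big) = 1 - \sum_{k=1}^{j-1}(\delta^{k-1} - \delta^{k}) = \delta^{j-1}, \ j = 1,\ldots,M. \tag{5.19} \]

On the other hand, since for any $i$ in $K_j$,

\[ a_i \ge \&(a)_{K_j} \ge \gamma(a,b;K_j,J \setminus K(j)) > \emptyset, \]

\[ P_\delta(a_i) \ge P_\delta(\gamma(a,b;K_j,J \setminus K(j))), \ \text{for } j = 1,\ldots,M. \tag{5.20} \]

Thus, combining eqs.(5.18)-(5.20) shows, for all $i$ in $K_j$,

\[ P_\delta(a_i|b_i) \ge (\delta^{j-1} - \delta^{j}) / \delta^{j-1} = 1-\delta, \ \text{for } j = 1,\ldots,M. \tag{5.21} \]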
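As a check on the assignment in eq.(5.18), the telescoping sums behind eqs.(5.19) and (5.21) can be written out explicitly. The case $M = 3$ shown in the last line is an illustrative choice made here, not one taken from the text.

```latex
\begin{align*}
\sum_{k=1}^{j-1}\bigl(\delta^{k-1}-\delta^{k}\bigr) &= 1-\delta^{\,j-1},
 & 1-\sum_{k=1}^{j-1}\bigl(\delta^{k-1}-\delta^{k}\bigr) &= \delta^{\,j-1},\\
\sum_{j=1}^{M-1}\bigl(\delta^{j-1}-\delta^{j}\bigr)+\delta^{M-1} &= 1,
 & \frac{\delta^{j-1}-\delta^{j}}{\delta^{j-1}} &= 1-\delta,\\
M=3:\quad (1-\delta)+(\delta-\delta^{2})+\delta^{2} &= 1 .
\end{align*}
```

The second line confirms that the masses assigned in eq.(5.18) sum to one, so that $P_\delta$ is a probability measure concentrated on the $M$ disjoint events.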

(i) implies (ii): Since all laws of probability apply to PSCEA, the FHH lower bound in eq.(1.1.5) holds with unconditionals replaced by conditionals and $P$ by $P_o$, i.e., assuming $P(b_j) > 0$, all $j$ in $J$,

\[ \max\Big(\sum_{j \in J} P(a_j|b_j) - (\text{card}(J)-1),\ 0\Big) \le P_o(\&_o(a|b)_J) \le \min_{j \in J} P(a_j|b_j). \tag{5.22} \]

(i) then implies that for every real $\delta$, $0 < \delta < 1$, there is a $P_\delta$ with $P_\delta(a_j|b_j) \ge 1-\delta$, for all $j$ in $J$, which combined with eq.(5.22), for all $\delta$, $0 < \delta < 1/\text{card}(J)$, shows

\[ 0 < 1 - \delta \cdot \text{card}(J) = \text{card}(J)(1-\delta) - (\text{card}(J)-1) \le \sum_{j \in J} P_\delta(a_j|b_j) - (\text{card}(J)-1) \le (P_\delta)_o(\&_o(a|b)_J), \tag{5.23} \]

which certainly implies that $\&_o(a|b)_J \neq \emptyset_o$. ■
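Criterion (vii) of Theorem 5.1 is purely boolean, so on a finite sample space it can be checked by enumerating the nonempty subsets $K$ of $J$. The sketch below is a minimal illustration under that assumption; the function names and the two example premise sets are hypothetical.

```python
from itertools import combinations

def material(b_j, a_j, omega):
    """Material conditional b_j => a_j, i.e. b_j' v a_j, as a set of atoms."""
    return (omega - b_j) | a_j

def shpl_consistent(a, b, omega):
    """Theorem 5.1(vii): for every nonempty K, &(b=>a)_K & v(a)_K is non-null."""
    J = list(a)
    for r in range(1, len(J) + 1):
        for K in combinations(J, r):
            conj = set(omega)   # running conjunction of material conditionals over K
            union_a = set()     # running disjunction of the consequents a_j over K
            for j in K:
                conj &= material(b[j], a[j], omega)
                union_a |= a[j]
            if not (conj & union_a):
                return False
    return True

# Hypothetical premise sets; each has 0 < a_j < b_j (proper conditional events).
omega4 = set(range(4))
conflicting = ({1: {0}, 2: {1}}, {1: {0, 1}, 2: {0, 1}})
omega6 = set(range(6))
compatible = ({1: {0}, 2: {0, 2}}, {1: {0, 1}, 2: {0, 2, 3}})

print(shpl_consistent(*conflicting, omega4))  # False
print(shpl_consistent(*compatible, omega6))   # True
```

In the first example the two premises share the antecedent $\{0,1\}$ but have disjoint consequents, so they cannot both be made highly probable; the check fails at $K = J$.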

Theorem 5.2. Characterization of SHPL deduction. (Extension of [Adams, 1975])

Under Basic Assumption I and SHPL consistency, the following statements are equivalent:

(i) $(a|b)_J \le_{SHPL} (c|d)$.

(ii) $\&_o(a|b)_J \le (c|d)$.

(iii) $\text{Or}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \le (c|d)\big)$.

Proof: By Theorem 4.1(i), (ii) iff (iii).

not(ii) implies not(i): Suppose not(ii). By Theorem 4.1(ii), not(ii) is equivalent to

\[ \&_o(a|b)_J\ \&_o\ (c'|d) \neq \emptyset_o. \tag{5.24} \]

But, by Theorem 5.1, where we assume $o$ not in $J$, letting $(a_o|b_o) =_d (c'|d)$ and $J_o =_d J \cup \{o\}$, replacing $J$ there by $J_o$:

For each real $\delta$, $0 < \delta < 1$, there is a $P_\delta$ such that for all $j$ in $J_o$, $P_\delta(a_j|b_j) \ge 1-\delta$,

\[ \text{i.e., } P_\delta(a_j|b_j) \ge 1-\delta, \text{ for all } j \text{ in } J, \text{ and } P_\delta(c'|d) \ge 1-\delta, \text{ i.e., } P_\delta(c|d) \le \delta. \tag{5.25} \]

Thus, the results in eq.(5.25) clearly show not(i).

(ii) implies (i): Simply apply the FHH inequality in a PSCEA setting, as in eqs.(5.22), (5.23). ■


Theorem 5.3. Characterization of WHPL consistency.

Under Basic Assumption I, the following statements are equivalent:

(i) WHPL consistency holds with respect to $(a|b)_J$.

(ii) $\&(b \Rightarrow a)_J \neq \emptyset$.

(iii) $\vee(a'b)_J \neq \Omega$.

Proof: (ii) iff (iii) is immediate.

(ii) implies (i): From eq.(4.17), (ii) implies there is some $K$, $\emptyset \subseteq K \subseteq J$, so that $\gamma(a,b;K,J) \neq \emptyset$. Then, pick any $P$ such that $P(\gamma(a,b;K,J)) = 1$. This immediately implies that (i) is satisfied.

not(ii) implies not(i): Suppose not(ii). First, consider any probability measure $P$ and $K_P =_d \{j \text{ in } J: P(b_j) = 0\}$. If $K_P = J$, then $P(\vee(b)_J) \le \sum_{j \in J} P(b_j) = 0$, implying $P(\vee(b)_J) = 0$. But, by not(ii), $\&(b \Rightarrow a)_J = \emptyset$ and hence $\vee(a'b)_J = \Omega$, implying $\vee(b)_J = \Omega$, contradicting the above probability evaluation. Thus, we must always have $\emptyset \subseteq K_P \subset J$ (proper), and hence $\emptyset \neq J \setminus K_P$. In turn, note that by the definition of $K_P$,

\[ 1 = P(\Omega) = P(\vee(a'b)_J) = P(\vee(a'b)_{J \setminus K_P}). \tag{5.26} \]

Now, suppose (i) were true. In particular, choose any real $\delta$, $0 < \delta < 1/(1+\text{card}(J))$, and any $P_\delta$ satisfying $P_\delta(a_j|b_j) \ge 1-\delta$ or $P_\delta(b_j) = 0$, for any $j$ in $J$. Thus, taking $P = P_\delta$ above, $P_\delta(b_j) = 0$ for all $j$ in $K_P$ and $P_\delta(a_j|b_j) \ge 1-\delta$ for all $j$ in $J \setminus K_P$. The latter is the same as

\[ P_\delta(a_j' b_j) \le (\delta/(1-\delta))\, P_\delta(a_j), \ \text{for all } j \text{ in } J \setminus K_P. \tag{5.27} \]

Combining eqs.(5.26) and (5.27),

\[ 1 = P_\delta(\vee(a'b)_{J \setminus K_P}) \le \sum_{j \in J \setminus K_P} P_\delta(a_j' b_j) \le \frac{\delta}{1-\delta} \sum_{j \in J \setminus K_P} P_\delta(a_j) \le \frac{\delta}{1-\delta}\,\text{card}(J \setminus K_P) \le \frac{\delta}{1-\delta}\,\text{card}(J) < 1, \]

a contradiction. Hence, not(i) must hold. ■

Theorem 5.4. Under SHPL consistency, WHPL deduction implies SHPL deduction.

Under Basic Assumption I and SHPL consistency,

\[ (a|b)_J \le_{WHPL} (c|d) \ \text{implies}\ (a|b)_J \le_{SHPL} (c|d). \]

Proof: Suppose not(SHPL). Then, eq.(5.25) in the proof of Theorem 5.2 clearly shows a violation of both SHPL and WHPL. ■


Theorem 5.5.

Under the Basic Assumption I, if $(a|b)_J$ is SHPL consistent and $(a|b)_J \le_{SHPL} (c|d)$, then

\[ \&(b \Rightarrow a)_J \le d \Rightarrow c. \tag{5.28} \]

Proof: By hypothesis, using Theorem 5.1,

\[ \emptyset_o \neq \&_o(a|b)_J \le (c|d). \tag{5.29} \]

Hence, by Theorem 4.1(iv), (i),

\[ \text{And}_{(\emptyset \neq K \subseteq J)}\big(\&_{AC}(a|b)_K \neq \emptyset_o\big) \ \text{and}\ (\text{there exists } K_1)(\emptyset \neq K_1 \subseteq J)\big(\&_{AC}(a|b)_{K_1} \le (c|d)\big). \tag{5.30} \]

Now, as pointed out in Section 4, under Assumption I, for any $K$, $\emptyset \neq K \subseteq J$, $\&_{AC}(a|b)_K \neq \Omega_o$. On the other hand, the left-hand side of eq.(5.30) shows that no $\&_{AC}$ result can be null either, i.e., we must have, in particular, $\&_{AC}(a|b)_{K_1}$ being a proper conditional event. Hence, the basic ordering criterion for PSCEA in eq.(4.19) can be invoked to characterize the right-hand side of eq.(5.30). Thus, noting again from Section 4 the structure of $\&_{AC}$, we obtain from the right-hand side of eq.(5.30)

\[ \&_{AC}(a|b)_{K_1} = (A|B) \le (c|d) \ \text{iff}\ [A \le c \ \text{and}\ B \Rightarrow A \le d \Rightarrow c], \tag{5.31} \]

where

\[ A =_d \&(b \Rightarrow a)_{K_1}\ \&\ B, \quad B =_d \vee(b_{K_1}). \]

In turn, part of the right-hand side of eq.(5.31) implies

\[ \&(b \Rightarrow a)_J \le \&(b \Rightarrow a)_{K_1} = B \Rightarrow A \le d \Rightarrow c, \]

the desired result. ■

Theorem 5.6. Characterization of WHPL deduction. (A variation of [Adams, 1986, 1996])

Under Basic Assumption I and WHPL consistency, the following statements are equivalent:

(i) $(a|b)_J \le_{WHPL} (c|d)$.

(ii) $\text{Or}_{(\emptyset \neq K \subseteq J)}\big([\&(b \Rightarrow a)_K\ \&\ \vee(b)_K \le c] \ \text{and}\ [\&(b \Rightarrow a)_K \le (d \Rightarrow c)]\big)$.

(iii) $\text{Or}_{(\emptyset \neq K \subseteq J)}\big([(\&(b \Rightarrow a)_K\ \&\ \vee(b)_K = \emptyset) \ \text{and}\ (\&(b \Rightarrow a)_K \le (d \Rightarrow c))] \ \text{or}\ [\emptyset_o \neq \&_{AC}(a|b)_K \le (c|d)]\big)$.

Proof: [Adams, 1986] provides an equivalent, but different-appearing, formulation and proof of the above theorem. A complete self-contained proof here is given in Appendix A. ■
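Criterion (ii) of Theorem 5.6 can likewise be tested by brute force on a finite sample space. The following sketch is only an illustration under that assumption; the function name and the example events are hypothetical, and WHPL consistency of the premises is taken for granted rather than checked.

```python
from itertools import combinations

def whpl_deduces(a, b, c, d, omega):
    """Theorem 5.6(ii): some nonempty K has &(b=>a)_K & v(b)_K <= c and &(b=>a)_K <= d=>c."""
    J = list(a)
    d_implies_c = (omega - d) | c
    for r in range(1, len(J) + 1):
        for K in combinations(J, r):
            conj = set(omega)   # conjunction of material conditionals b_j => a_j over K
            union_b = set()     # disjunction of antecedents b_j over K
            for j in K:
                conj &= (omega - b[j]) | a[j]
                union_b |= b[j]
            if (conj & union_b) <= c and conj <= d_implies_c:
                return True
    return False

# Hypothetical premises that are WHPL consistent but not SHPL consistent:
# a_1 a_2 = a_1 b_2' = a_2 b_1' = empty and b_1' b_2' = {2, 3} is non-null.
omega = set(range(4))
a = {1: {0}, 2: {1}}
b = {1: {0, 1}, 2: {0, 1}}

print(whpl_deduces(a, b, {2}, {0, 1, 2}, omega))  # True:  b_1'b_2' <= d=>c = {2, 3}
print(whpl_deduces(a, b, {0}, {0, 1, 2}, omega))  # False: b_1'b_2' is not <= d=>c = {0, 3}
```

Both outcomes agree with the special-case criterion $b_1' b_2' \le d \Rightarrow c$ derived for $J = \{1,2\}$ in eq.(5.35).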

Remark 1.

Improving upon Theorem 5.4, Theorem 5.6 shows directly that under $(a|b)_J$ having SHPL consistency, WHPL and SHPL deduction of $(c|d)$ from $(a|b)_J$ are equivalent. It also shows directly that under WHPL consistency, WHPL deduction implies WCPL deduction of $(c|d)$ from $(a|b)_J$. See also the summary of consistency and deduction relations in Theorem 5.11.

Remark 2.

As an illustration of weak vs. strong HPL deduction, consider, e.g., the case of $J = \{1,2\}$, where for purpose of nontriviality we assume that $(a|b)_J$ is not SHPL consistent, but is WHPL consistent, i.e., from the above theorems,

\[ \&_o(a|b)_J = \emptyset_o \ \text{and}\ \&(b \Rightarrow a)_J \neq \emptyset. \tag{5.32} \]

Using eqs.(4.11), (4.16), (4.17), it follows that eq.(5.32) (under our Basic Assumption I) is equivalent to

\[ a_1 a_2 = a_1 b_2' = a_2 b_1' = \emptyset, \quad b_1' b_2' \neq \emptyset. \tag{5.33} \]

Consider next the possible situations with respect to any real $\delta$, $0 < \delta < 1$, and $P_\delta$ such that the weak consistency condition at $\delta$ is satisfied, i.e., either $P_\delta(a_j|b_j) \ge 1-\delta$ or $P_\delta(b_j) = 0$, $j = 1, 2$.

Case 1: $P_\delta(a_j|b_j) \ge 1-\delta$, $j = 1, 2$. But, by appealing to the FHH lower bound (eq.(5.22)), it is clear that for $\delta$ sufficiently small, we would have $(P_\delta)_o((a_1|b_1)\,\&_o\,(a_2|b_2))$ large, contradicting the left-hand side of eq.(5.32), where the value should be zero.

Case 2: $P_\delta(a_1|b_1) \ge 1-\delta$ and $P_\delta(b_2) = 0$. But, eq.(5.33) shows that since $P_\delta(b_2') = 1$, $P_\delta(a_1) = P_\delta(a_1 b_2') = P_\delta(\emptyset) = 0$, contradicting $P_\delta(a_1|b_1) \ge 1-\delta$ above.

Case 3: $P_\delta(a_2|b_2) \ge 1-\delta$ and $P_\delta(b_1) = 0$. This yields, dually, the same contradiction as in Case 2.

Case 4: This is the only remaining possibility: $P_\delta(b_1) = P_\delta(b_2) = 0$, i.e., $P_\delta(b_1' b_2') = 1$.

Thus, so far,

\[ \begin{aligned}
(a|b)_J \le_{WHPL} (c|d) \ \text{iff}\ &(\text{for all real } \varepsilon)(0 < \varepsilon < 1)(\text{there is a real } \delta)(0 < \delta < 1)(\text{for all } P) \\
&(\text{if } P(b_1' b_2') = 1, \text{ then either } P(c|d) \ge 1-\varepsilon \text{ or } P(d) = 0).
\end{aligned} \tag{5.34} \]

Situation 1: $b_1' b_2' c'd \neq \emptyset$. But, pick any $P$ such that $P(b_1' b_2' c'd) = 1$, thus showing $\text{not}((a|b)_J \le_{WHPL} (c|d))$ here.

Situation 2: $b_1' b_2' \le d \Rightarrow c$, the only remaining possibility, which obviously from the constraint on possible $P$’s works, i.e., for any $P$ satisfying the “if-part” of eq.(5.34), $P(d \Rightarrow c) = 1$, i.e., either $P(d) = 0$ or $P(c|d) = 1$.

Hence, in summary, for $J = \{1,2\}$ and $(a|b)_J$ being WHPL consistent, but not SHPL consistent,

\[ (a|b)_J \le_{WHPL} (c|d) \ \text{iff}\ b_1' b_2' \le d \Rightarrow c, \tag{5.35} \]

with no proper conditional probabilities involved when the deduction holds.


Lemma 5.1. A useful characterization.

Here, assume a probability space $(\Omega,\mathcal{B},P)$ present, with $P$ variable, $J$ a finite index set, $\emptyset \neq e_j, f$ in $\mathcal{B}$, $j$ in $J$. Then, the following two statements are equivalent:

(i) (There is a $P$)($P(f) = 1$ and for all $j$ in $J$, $P(e_j) > 0$).

(ii) $\text{And}_{(j \in J)}(e_j f \neq \emptyset)$.

Proof: Straightforward. ■

Theorem 5.7. Characterization of WCPL consistency.

Under the Basic Assumption I, the following statements are equivalent:

(i) $(a|b)_J$ is WCPL consistent.

(ii) $\&(b \Rightarrow a)_J \neq \emptyset$.

Proof: (ii) implies (i): Suppose (ii) holds. Let $P$ be such that $P(\&(b \Rightarrow a)_J) = 1$. This yields $P(b_j \Rightarrow a_j) = 1$, and hence $P(a_j|b_j) = 1$ or $P(b_j) = 0$, as required, for all $j$ in $J$.

not(ii) implies not(i): Suppose not(ii) holds. Then, it is impossible to find any $P$ such that for all $j$ in $J$, either $P(b_j) = 0$ or $P(a_j|b_j) = 1$, since this would imply $P(b_j \Rightarrow a_j) = 1$ and hence $P(\&(b \Rightarrow a)_J) = 1$, implying $\&(b \Rightarrow a)_J \neq \emptyset$, contrary to the assumption. ■

Theorem 5.8. Characterization of WCPL deduction. [Adams, 1996]

Under the Basic Assumption I and WCPL consistency for $(a|b)_J$, the following statements are equivalent:

(i) $(a|b)_J \le_{WCPL} (c|d)$.

(ii) $\&(b \Rightarrow a)_J \le d \Rightarrow c$.

Proof: The proof is basic and has appeared in a number of places. However, for completeness, we present a brief outline.

(ii) implies (i): Suppose (ii). Then any $P$ such that $P(b_j) = 0$ or $P(a_j|b_j) = 1$, i.e., $P(b_j \Rightarrow a_j) = 1$, all $j$ in $J$, satisfies $1 = P(\&(b \Rightarrow a)_J) \le P(d \Rightarrow c)$, so that either $P(d) = 0$ or $P(c|d) = 1$.

not(ii) implies not(i): Suppose not(ii). Then, $\&(b \Rightarrow a)_J\ \&\ c'd \neq \emptyset$, and by choosing $P$ such that $P(\&(b \Rightarrow a)_J\ \&\ c'd) = 1$, we easily see that not(i) holds. ■

Theorem 5.9. Characterization of SCPL consistency.

Under Basic Assumption I, the following statements are equivalent:

(i) $(a|b)_J$ is SCPL consistent.

(ii) $\text{And}_{(i \in J)}\big(b_i\ \&\ (\&(b \Rightarrow a)_J) \neq \emptyset\big)$.

(iii) $\text{And}_{(i \in J)}\big(a_i\ \&\ (\&(b \Rightarrow a)_J) \neq \emptyset\big)$.

(iv)  And(  aj— i  (&(b=>a)j)  ^  0). 

i  in  J 

Proof: (iii) and (iv) are obviously equivalent.

(ii) implies (i): Suppose (ii). Then, using Lemma 5.1, with $e_i = b_i$ and $f = \&(b \Rightarrow a)_J$, $i$ in $J$, there is a $P$ such that $P(\&(b \Rightarrow a)_J) = 1$ and $P(b_i) > 0$, $i$ in $J$. This is sufficient, as similar reasoning in previous proofs shows, to ensure that (i) holds.

not(ii) implies not(i): Suppose not(ii). Then, there is an $i$ in $J$ such that $b_i\ \&\ (\&(b \Rightarrow a)_J) = \emptyset$. Hence, any $P$ with $P(a_j|b_j) = 1$, all $j$ in $J$, satisfies $P(\&(b \Rightarrow a)_J) = 1$, and thus, by the above disjointness, $P(b_i) = 0$, a contradiction to $P(a_i|b_i) = 1$. Thus, not(i) must hold.

(iii) implies (ii): Obvious, since each $a_i \le b_i$.

not(iii) implies not(i): Suppose not(iii). Then, there is some $i$ in $J$ with $a_i\ \&\ (\&(b \Rightarrow a)_J) = \emptyset$. Hence, if there is some $P$ so that $P(a_j|b_j) = 1$, all $j$ in $J$, again this implies $P(\&(b \Rightarrow a)_J) = 1$, implying, in turn, from the above disjointness condition, that $P(a_i) = 0$, contradicting $P(a_i|b_i) = 1$. Thus, not(i) must hold. ■

Theorem 5.10. Characterization of SCPL deduction.

Under Basic Assumption I and SCPL consistency holding for $(a|b)_J$, the following statements are equivalent:

(i) $(a|b)_J \le_{SCPL} (c|d)$.

(ii) $\text{Or}_{(i \in J)}\big(\&(b \Rightarrow a)_J \le (b_i \vee d) \Rightarrow c\big)$.

Proof: not(i) holds iff (there is a $P$)($(P(a|b)_J = 1_J)$ and (either $P(d) = 0$ or $P(c|d) < 1$))

iff (there is a $P$)($(P(\&(b \Rightarrow a)_J) = 1)$ and $(P(b)_J > 0_J)$ and (either $P(d') = 1$ or $P(c'd) > 0$))

iff ([(there is a $P$)($(P(d'\ \&\ (\&(b \Rightarrow a)_J)) = 1)$ and $(P(b)_J > 0_J)$)] or [(there is a $P$)($(P(\&(b \Rightarrow a)_J) = 1)$ and $(P(b)_J > 0_J)$ and $(P(c'd) > 0)$)])

iff, using Lemma 5.1 twice, with at first $f = d'\ \&\ (\&(b \Rightarrow a)_J)$ and $e_i = b_i$, and then $f = \&(b \Rightarrow a)_J$, $e_i = b_i$, $e_o = c'd$, by extending index set $J$ to include $o$ corresponding to $c'd$, etc.,

\[ \big[\text{And}_{(i \in J)}\big(b_i d'\ \&\ (\&(b \Rightarrow a)_J) \neq \emptyset\big)\big] \ \text{or}\ \big[\text{And}_{(i \in J)}\big(b_i\ \&\ (\&(b \Rightarrow a)_J) \neq \emptyset\big) \ \text{and}\ \big(c'd\ \&\ (\&(b \Rightarrow a)_J) \neq \emptyset\big)\big]. \tag{5.35'} \]

Hence, the equivalence in eq.(5.35') shows, by negating through,

\[ \begin{aligned}
\text{(i) holds iff}\ &\big[\text{Or}_{(i \in J)}\big(\&(b \Rightarrow a)_J \le b_i \Rightarrow d\big)\big] \ \text{and}\ \big[\text{Or}_{(i \in J)}\big(\&(b \Rightarrow a)_J \le b_i'\big) \ \text{or}\ \big(\&(b \Rightarrow a)_J \le d \Rightarrow c\big)\big] \\
\text{iff}\ &\big[\text{Or}_{(i \in J)}\big(\&(b \Rightarrow a)_J \le b_i'\big) \ \text{or}\ \text{Or}_{(i \in J)}\big(\&(b \Rightarrow a)_J \le (b_i \vee d) \Rightarrow c\big)\big].
\end{aligned} \tag{5.36} \]

However, because of SCPL consistency (see Theorem 5.9(ii)), we cannot have the left-hand side expression at the bottom of eq.(5.36), $[\text{Or}_{(i \in J)}(\&(b \Rightarrow a)_J \le b_i')]$, holding true there. Hence, (i) holds iff the bottom right-hand side expression of (5.36) holds, which is the desired result (ii). ■
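The boolean criteria of Theorem 5.9(ii) and Theorem 5.10(ii) can be evaluated directly on a finite sample space. The sketch below is a minimal illustration under that assumption; the function names and the example events are hypothetical.

```python
def conj_material(a, b, omega):
    """&(b=>a)_J: conjunction of the material conditionals b_j' v a_j over all j."""
    f = set(omega)
    for j in a:
        f &= (omega - b[j]) | a[j]
    return f

def scpl_consistent(a, b, omega):
    """Theorem 5.9(ii): every b_i meets &(b=>a)_J."""
    f = conj_material(a, b, omega)
    return all(b[i] & f for i in a)

def scpl_deduces(a, b, c, d, omega):
    """Theorem 5.10(ii): for some i, &(b=>a)_J <= (b_i v d) => c."""
    f = conj_material(a, b, omega)
    return any(f <= ((omega - (b[i] | d)) | c) for i in a)

# Hypothetical premises on six atoms, with 0 < a_j < b_j.
omega = set(range(6))
a = {1: {0}, 2: {0, 2}}
b = {1: {0, 1}, 2: {0, 2, 3}}

print(scpl_consistent(a, b, omega))                  # True
print(scpl_deduces(a, b, {0, 2}, {0, 2, 3}, omega))  # True:  the conclusion repeats (a_2|b_2)
print(scpl_deduces(a, b, {0}, {0, 1, 2, 3}, omega))  # False
```

For the last case a direct counterexample exists: a measure with mass 1/2 on each of the atoms 0 and 2 gives $P(a_1|b_1) = P(a_2|b_2) = 1$ but $P(c|d) = 1/2$.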

Theorem 5.11. Basic relations among consistencies and deductions.

Under Basic Assumption I, the following hold:

(i) With respect to $(a|b)_J$: either SCPL consistency or SHPL consistency implies WHPL consistency, which is equivalent to WCPL consistency.

(ii) Under also SHPL consistency for $(a|b)_J$,

\[ ((a|b)_J \le_{WHPL} (c|d)) \ \text{iff}\ ((a|b)_J \le_{SHPL} (c|d)) \ \text{implies}\ ((a|b)_J \le_{WCPL} (c|d)). \]

(iii) Under also WHPL consistency for $(a|b)_J$,

\[ ((a|b)_J \le_{WHPL} (c|d)) \ \text{implies}\ ((a|b)_J \le_{WCPL} (c|d)). \]

Proof: Simply compare the algebraic forms in the previous theorems of this section. ■

Theorem 5.12. Reduction of weak and strong CPL and HPL consistencies and deductions to the classical logic case for unconditional events.

Make Basic Assumption I, with the proviso that now $b_j = \Omega = d$. Then:

(i) $(a|\Omega)_J$ is SHPL consistent iff it is WHPL consistent iff it is WCPL consistent iff it is SCPL consistent iff $\&(a_J) \neq \emptyset$.

(ii) Under the common consistency assumption in (i),

\[ ((a|\Omega)_J \le_{SHPL} (c|\Omega)) \ \text{iff}\ ((a|\Omega)_J \le_{WHPL} (c|\Omega)) \ \text{iff}\ ((a|\Omega)_J \le_{WCPL} (c|\Omega)) \ \text{iff}\ ((a|\Omega)_J \le_{SCPL} (c|\Omega)) \ \text{iff}\ \&(a_J) \le c, \]

the same as in classical logic (see, e.g., [Copi, 1986], where the basic conjunctive deduction relation is usually presented via the equivalent truth-table form of “whenever all $a_j$ are verified (or true), then so must $c$ (be true)”).

Proof: Apply the simplifying constraint $b_j = \Omega = d$ to all of the previous relevant theorems in this section. ■

Theorem  5.13.  The  behavior  of  minconc  when  deduction  fails  in  the  HPL  or  CPL  senses. 

Under  Basic  Assumption  I: 

(i)  Under  also  SHPL  consistency  for  (alb)j:  If  not((al£2)j  <shpl  (cILi)),  then  not  (al£2)j  <shpl 
(cm)),  and  for  any  real  8,  0  <  8  <  1,  slightly  abusing  notation  by  replacing  (1-8)- lj  by  just  1-8, 

minconc2((alb)j;(cld))(l-8)  <  8. 

(ii)  Under  WCPL  consistency  for  (alb)j:  If  not  ((alU)j  <wcpl  (cIQ)),  then 

minconc2((alb)j;(cld))(lj)  =  0. 

(iii)  Under  SCPL  consistency,  where  also 

Or(  &(b=>a)j  <  bj=>d  ), 

i  in  J 

then,  [  not  (((alfl)j  <scpl  (cILi))  ]  implies  [minconc2((alb)j;(cld))(lj)  =  0]. 

Proof: Employ each algebraic characterization of the appropriate type of deduction (via the
above  theorems  in  this  section)  and  consider  the  negation  of  each  characterization  and  choose  a  P 
over  this  region,  usually  with  the  value  1.  ■ 

As  stated  previously,  in  one  way  or  another,  Theorem  5.13  points  out  specifically  the  essentially 
extreme (0,1)-only possible values for the minconc function, for sufficiently high threshold
values.  Thus,  the  minconc  function,  despite  its  use  as  seen  in  the  above  theorems,  cannot  be 
interpreted  as  a  way  of  measuring  “degree  of  deduction”  —  or  softening  of  deduction  of  (cld) 
from  (alb)j  —  when,  instead  of  taking  limits  of  minconc  as  the  premise  thresholds  approach  unity, 
one  holds  them  fixed  and  simply  uses  the  evaluation  of  minconc  as  is.  This  leads  to  considering 
the  meanconc  function  as  a  possibly  more  reasonable  candidate  to  reflect  such  softening  of 
deduction,  as  will  be  verified  in  the  next  section. 

6.  Some  New  Results  and  Insights  in  Expected  Surety  Logic 

6.1  Review  of  Relevant  Definitions  and  Concepts  Required 

Returning to the meanconc function and expected surety logic, consider the following: Suppose again we make Basic Assumption I, as given at the beginning of Section 5 and used throughout there. Recall also the discussion at the beginning of Section 2, where a set of $m+1$ atoms $\mathcal{A} = \{\alpha_1,\ldots,\alpha_m,\alpha_{m+1}\} \subseteq \mathcal{B}$ (for given probability space $(\Omega,\mathcal{B},P)$) is given with respect to $((a|b)_J;(c|d))$ (and relative to all boolean combinations of the antecedents $b_j$ and consequents $a_j$ of each proper conditional event $(a_j|b_j)$, $j$ in $J$). Unless otherwise stated, the designated atom $\alpha_{m+1} \le \&(b')_J$ is, therefore, assumed nonvacuous. For any probability measure $P$, we have its natural identification with the evaluations of $P$ over atoms $\alpha_1,\ldots,\alpha_m$, i.e., with values $(P(\alpha_1),\ldots,P(\alpha_m)) = (x_1,\ldots,x_m) =_d X^T$, with $P(\alpha_{m+1}) = x_{m+1} = 1 - \text{sum}(X)$, where $(\cdot)^T$ denotes vector or matrix transpose, and with $X$ lying in the $m$-simplex $S_m$ of all possible values of such $X$: $0_m \le X \le 1_m$, $\text{sum}(X) \le 1$, so that as $P$ varies, $X$ varies, etc. Also, each event $c$ in the boolean combinations generated by $\mathcal{A}$ can be uniquely expressed as a disjoint disjunction of certain of the atoms, written $c = \vee(\alpha_{I(c)}) =_d \bigvee_{j \in I(c)}(\alpha_j)$ (using, again, multivariable notation), where index set $I(c) \subseteq \{1,\ldots,m,m+1\}$ (with usually only the first $m$ integers involved). When unambiguous, we will interchange $P$ with $X$ being in $S_m$, keeping in mind the last component $x_{m+1}$ of $P$ is not really in $S_m$. Also, as in the discussion in Section 2, we use the notation which identifies any relevant event as an $m$ by 1 column vector of 0’s and 1’s with respect to the first $m$ components of $\mathcal{A}$ (provided $\alpha_{m+1}$ is not part of it).

Next, for each real s, t, 0 < s, t < 1, and any real vector $\mathbf{s}$ of size card(J), define the following
subsets of $S_m$:

$$A_t =_d \{P \in S_m: P(a|b)_J > t\} =_w \{X \in S_m: h_1(X) > t\} \quad \text{(common lower bound } t\text{)}, \tag{6.1.1}$$

$$A_{(\mathbf{s})} =_d \{P \in S_m: P(a|b)_J = \mathbf{s}\} =_w \{X \in S_m: h_1(X) = \mathbf{s}\}, \tag{6.1.2}$$

$$B_t =_d \{P \in S_m: P_0(\&_0(a|b)_J) > t\} =_w \{X \in S_m: h_2(X) > t\}, \tag{6.1.3}$$

$$B_{(s)} =_d \{P \in S_m: P_0(\&_0(a|b)_J) = s\} =_w \{X \in S_m: h_2(X) = s\}, \tag{6.1.4}$$

$$C_t =_d \{P \in S_m: P_0(\&_{AC}(a|b)_J) > t\} =_w \{X \in S_m: h_3(X) > t\}, \tag{6.1.5}$$

$$C_{(s)} =_d \{P \in S_m: P_0(\&_{AC}(a|b)_J) = s\} =_w \{X \in S_m: h_3(X) = s\}. \tag{6.1.6}$$

Here,

$$h_1(X) =_d (h_{1,j}(X))_{j \in J}, \qquad h_{1,j}(X) =_d (a_j^T \cdot X \,/\, b_j^T \cdot X), \quad j \in J, \tag{6.1.7}$$

are bounded bilinear functions in X. $h_2(X)$ as a function of X is obtained similarly to $h_1$, by
replacing P everywhere appropriately by X in the computations in eq.(4.16') and, while much
more nonlinear in structure than $h_1(X)$, it is nevertheless also a "well-behaved" function (i.e.,
differentiable, bounded, etc.). $h_3(X)$ is obtained likewise from eq.(4.27) with P replaced by X,
noting that, just as $h_1(X)$, $h_3(X)$ is also a bounded bilinear function in X. Also, for any choice of
prior (second order) probability distribution for X over $S_m$, such as typically here Dirichlet
(including the uniform one over $S_m$ as a special case), denote the corresponding cdfs with
respect to antecedent spaces $A_t$, $B_t$, $C_t$ by $F_{i,t}$ and the corresponding cdfs with respect to $A_{(s)}$, $B_{(s)}$,
$C_{(s)}$ by $G_{i,s}$: for any x in $S_m$ (or, more generally, in m-dimensional real space),

$$F_{1,t}(x) =_d \mathrm{Prob}(X < x \mid X \in A_t), \qquad G_{1,s}(x) =_d \mathrm{Prob}(X < x \mid X \in A_{(s)}), \tag{6.1.8}$$

$$F_{2,t}(x) =_d \mathrm{Prob}(X < x \mid X \in B_t), \qquad G_{2,s}(x) =_d \mathrm{Prob}(X < x \mid X \in B_{(s)}), \tag{6.1.9}$$

$$F_{3,t}(x) =_d \mathrm{Prob}(X < x \mid X \in C_t), \qquad G_{3,s}(x) =_d \mathrm{Prob}(X < x \mid X \in C_{(s)}). \tag{6.1.10}$$
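The identification of events with 0/1 column vectors over the first m atoms, and the ratio form of eq.(6.1.7), can be illustrated with a minimal sketch. The function name `conditional_prob`, the three-atom layout, and the numerical values are hypothetical and chosen only for illustration; the numerator vector is assumed to be the indicator of the conjunction "a and b".

```python
from fractions import Fraction

def conditional_prob(a, b, x):
    """Return (a^T X) / (b^T X) for 0/1 indicator vectors a, b over the first m atoms."""
    numerator = sum(ai * xi for ai, xi in zip(a, x))
    denominator = sum(bi * xi for bi, xi in zip(b, x))
    if denominator == 0:
        raise ZeroDivisionError("antecedent has probability zero")
    return numerator / denominator

# Hypothetical example with m = 3 listed atoms; the (m+1)-th atom
# carries the remaining mass x_{m+1} = 1 - sum(X) = 4/10.
x = [Fraction(3, 10), Fraction(1, 10), Fraction(2, 10)]
ab = [1, 0, 0]   # indicator of the atoms making up "a and b"
b = [1, 1, 0]    # indicator of the atoms making up the antecedent b
print(conditional_prob(ab, b, x))   # 3/4
```

Exact rational arithmetic is used here only so that the small example can be checked by hand.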

6.2  Method  of  Approach  to  Evaluation  of  Meanconc 

It is clear that for the fixed (i.e., non-unity limiting) threshold case the determination of the two
types of meanconc functions, i.e., $E(P(c|d) \mid P \in A_t)$ or $E(P(c|d) \mid P \in A_{(t)})$, requires evaluation
in general of multiple integrals over the polytope formed from $S_m$ and the constraints via $A_t$ or
$A_{(t)}$. Tables 1 and 2 illustrate closed-form results for a number of special types of premise sets
and potential conclusions. Even there, in some cases, such as transitivity, full evaluation, under
the simplest assumptions, such as choice of a uniformly distributed prior for P, required lengthy
integration evaluations. (See Section 6.3 of [Goodman, 1999] for an outline of the calculations
for the transitivity case.) On the other hand, there have been many advances in the area of such
calculations, as seen, e.g., in [Bisztriczky et al., 1994], which potentially can significantly reduce
such calculations. (See also [Goodman & Nguyen, 2000] for further discussion.)
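For orientation, the fixed-threshold quantity $E(P(c|d) \mid P \in A_t)$ can also be approximated by brute-force sampling. The sketch below is not the approach developed in this paper: it is a plain rejection-sampling estimate under a Dirichlet prior, with hypothetical function names and a hypothetical three-atom example, and it treats the threshold constraint as "at least t" (which differs from a strict inequality only on a set of prior probability zero).

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [v / total for v in draws]

def mass(atom_indices, p):
    """Probability of an event given as a collection of atom indices."""
    return sum(p[i] for i in atom_indices)

def meanconc_mc(premises, conclusion, t, alphas, n_samples=100_000, seed=0):
    """Rejection-sampling estimate of E(P(c|d) | P(a_j|b_j) >= t for all j).

    Each premise and the conclusion is a pair (numerator_atoms, denominator_atoms),
    so that P(a|b) = mass(numerator_atoms) / mass(denominator_atoms).
    """
    rng = random.Random(seed)
    running_sum, accepted = 0.0, 0
    for _ in range(n_samples):
        p = sample_dirichlet(alphas, rng)
        # Keep only the sampled measures that satisfy every premise threshold.
        if all(mass(num, p) >= t * mass(den, p) for num, den in premises):
            c_num, c_den = conclusion
            running_sum += mass(c_num, p) / mass(c_den, p)
            accepted += 1
    return running_sum / accepted if accepted else float("nan")

# Hypothetical three-atom layout: atom 0 = ab, atom 1 = a'b, atom 2 = b'.
rule = ((0,), (0, 1))                       # the conditional (a|b)
estimate = meanconc_mc([rule], rule, t=0.9, alphas=[1.0, 1.0, 1.0])
print(round(estimate, 3))
```

In this toy case the conclusion coincides with the single premise, and under the uniform prior $P(a|b)$ is uniformly distributed on [0,1], so the estimate should be close to (1 + 0.9)/2 = 0.95. The acceptance rate falls quickly as t approaches unity or as premises are added, which is one reason closed-form or plug-in alternatives are of interest.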

With the above in mind, let us consider an alternative to either attempting to obtain full
closed-form evaluations or direct numerical approximations. The direction here is one of
approximation, but in the following sense: We first attempt to show that, in an asymptotic sense,
the expectation antecedents, $(P \in A_t)$, $(P \in A_{(t)})$, are essentially equivalent in their effect upon the
potential consequent $P(c|d)$ as the related forms $B_t$, $B_{(t)}$ (where P satisfies $P_0(\&_0(a|b)_J) > t$, $= t$)
and $C_t$, $C_{(t)}$ (where P satisfies $P(\&_{AC}(a|b)_J) > t$, $= t$). In addition, another justification for
considering replacement of $A_t$ (but not $A_{(t)}$, etc.) by the corresponding spaces determined by $\&_0$
or by $\&_{AC}$ is the "intertwining" (or asymptotic intertwining) property they possess with respect
to one another. That is, any one of the three spaces, $A_t$, $B_t$, $C_t$, determined by the separate
constraints $P(a_j|b_j) > t$, essentially lies inside either of the other two for t appropriately changed, but
still retaining a value relatively close to unity. While this property alone does not guarantee
convergence of the corresponding conditional expectations to each other as t approaches unity, it
is an enhancing property of the closeness of the three spaces to each other asymptotically (see
Theorem 6.5). Fortunately, Theorem 6.2 reinforces this closeness in demonstrating that, indeed,
under mild conditions, all three spaces do lead asymptotically to the same relevant conditional
distributions and expectations. While it is of some interest to know that $\&_0(a|b)_J$ could be used at
least in theory for the desired replacement at some reasonably high (though not necessarily
exactly unit) threshold level, inspection of its structure in eqs.(4.14)-(4.16') compared to that of
the far simpler (and, in fact, single conditional event reducing) $\&_{AC}(a|b)_J$ (see eqs.(4.23)-
(4.27)) shows the latter as the preferred candidate (even though it is not a true conjunction
operator, etc.).

Ideally, one would then determine the replaced conditional expectation $E(P(c|d) \mid P(\&_{AC}(a|b)_J) > t)$
(or $E(P(c|d) \mid P(\&_{AC}(a|b)_J) = t)$). However, even in this case, preliminary results (at this point)
indicate there is still considerable complexity of the required integration procedure, although
these investigations do show a basic connection with the evaluation of certain integrals of
hypergeometric functions of higher order. But final success can be achieved in a modified way,
by adapting an approach analogous to that employed by the popular naive maximum entropy
approach to $E(P(c|d) \mid P \in A_t)$, where first a criterion is satisfied, i.e., the possible
(first order probability) entropy is maximized, and then the result is plugged into the objective
function, i.e., the entropy-maximizing probability measure $P^*$ is then used to evaluate $P^*(c|d)$. Thus, as the
counterpart to the above, we seek first to find that probability measure $P^{\#}$ which is most central
to the premise set, i.e., we seek to obtain $P^{\#} = E(X \mid X \in A_t)$ or the similar expression where $A_t$ is
replaced by $A_{(t)}$ (see also the limiting forms in Theorem 6.2(iii) below) and then "plug in" to
evaluate $P^{\#}(c|d)$. For simplicity, we shall consider only $E(X \mid X \in A_{(t)})$ in the actual evaluations,
carried out in Theorems 6.6 and 6.7.

The  next  result  is  stated  in  part  in  [Goodman  &  Nguyen,  2000]  and  is  presented  here  in  slightly 
different  form  for  clarity.  It  is  a  generalization  of  a  basic  surface  integral  result  due  originally  to 
[Higgins,  1975],  with  related  work  carried  out  by  [Saw,  1973]: 

Theorem 6.1. Decomposition into a weighted sum of ratios of surface integrals of any
conditional cdf whose antecedent is generated from the implicit solution to a well behaved
function being constant.

Let $(\Omega,\mathcal{B},P)$ be a real probability space, X an associated m by 1 random vector over some domain
S in (Real line)$^m$ with joint pdf f which is continuous and uniformly bounded over S. Let n be a
positive integer with n < m and let $h: S \to (\text{Real line})^n$ be a "well-behaved" function (uniformly
bounded above and below away from zero, differentiable, etc.). Then, for any event $c \subseteq S$ and
any real vector s in range(h),

$$P(X \in c \mid h(X) < s) = \int_{r<s} \big(p(h,f;c)(r)\cdot g(h,f)(r)\big)\,dr \quad \text{(ordinary integral)}, \tag{6.2.1}$$

where

$$p(h,f;c)(r) =_d \psi(h,r;f,c)\,/\,\psi(h,r;f,S), \tag{6.2.2}$$

$$g(h,f)(r) =_d \psi(h,r;f,S) \Big/ \int_{v<r} \psi(h,v;f,S)\,dv, \tag{6.2.3}$$

and surface integral

$$\psi(h,s;f,c) =_d \int_{X \in \mathrm{surf}_{h,s} \cap c} \frac{f(X)}{\det\big((dh(X)/dX)(dh(X)/dX)^T\big)}\; d\,\mathrm{surf}_{h,s}(X); \tag{6.2.4}$$

$$\mathrm{surf}_{h,s} = \{X \in S: h(X) = s\} = h^{-1}(s). \tag{6.2.5}$$

(See, e.g., [Devinatz, 1968] for details of general surface integration; the surface integral in
eq.(6.2.4) can also be converted back to ordinary integral form, but will not be needed here.)
Note that the non-negative ("well-behaved") function in r, $p(h,f;c)(r)$, is bounded by unity and
that $g(h,f)(r)$ as a function of r is a legitimate pdf. ■

Remark. 

Note first that the vector derivative in eq.(6.2.4) is the n by m matrix of partial derivatives of the
various scalar component functions of h with respect to each single argument and, as part of the
"well-behaved" property of h, it is assumed to be of full rank n (< m) so that the factor
$\det\big((dh(X)/dX)(dh(X)/dX)^T\big) > 0$ for all X in $\mathrm{surf}_{h,s}$, for all s in range(h).

The original form of the above theorem does not appear as a weighted sum: the factor

$$\psi(h,r;f,S) = \int_{X \in \mathrm{surf}_{h,r} \cap S} \frac{f(X)}{\det\big((dh(X)/dX)(dh(X)/dX)^T\big)}\; d\,\mathrm{surf}_{h,r}(X) = \int_{X \in \mathrm{surf}_{h,r}} \frac{f(X)}{\det\big((dh(X)/dX)(dh(X)/dX)^T\big)}\; d\,\mathrm{surf}_{h,r}(X) \tag{6.2.6}$$

cancels out in eq.(6.2.1).


Theorem  6.2.  Three  equivalent  limiting  forms  involving  meanconc. 

Suppose that Basic Assumption I holds. For a second order probability prior over $S_m$, choose
random vector X (representing a random probability measure) to be distributed so that its pdf
over $S_m$ is bounded and continuous. For example, we can choose the Dirichlet distribution dir($\tau$)
for X with parameter $\tau$ such that $\tau > 1$. Suppose also that

$$A_1 =_d \bigcap_{0<t<1}(A_t) \ne \emptyset. \tag{6.2.7}$$

(The condition in eq.(6.2.7) can be analyzed via Theorem 5.9 for consistency of SHPL.) Then:

(i) Referring to eqs.(6.2.1)-(6.2.3), each of the three ordinary cdfs $F_{j,t}$ can be identified as a
weighted average of the corresponding cdfs $G_{j,s}$ over the exact threshold spaces, all identified in
the natural sense with the expansion in Theorem 6.1: For all x in $S_m$,

$$F_{1,t}(x) = \int_{t<s<1} \big(G_{1,s}(x)\cdot g_1(s)\big)\,ds, \quad F_{2,t}(x) = \int_{t<s<1} \big(G_{2,s}(x)\cdot g_2(s)\big)\,ds, \quad F_{3,t}(x) = \int_{t<s<1} \big(G_{3,s}(x)\cdot g_3(s)\big)\,ds.$$

The above representations rigorize formal readily-derived counterparts which simply use
integration-out of variables and chaining.

(ii) $$\lim_{t \uparrow 1} F_{1,t}(x) = G_{1,1}(x), \quad \lim_{t \uparrow 1} F_{2,t}(x) = G_{2,1}(x), \quad \lim_{t \uparrow 1} F_{3,t}(x) = G_{3,1}(x),$$

where, for the limiting cdfs,

$$G_{1,1}(x) = G_{2,1}(x) = G_{3,1}(x) =_d G(x), \quad \text{all } x \text{ in } S_m.$$

(iii) $$\lim_{t \uparrow 1} E(X \mid X \in A_t) = \lim_{t \uparrow 1} E(X \mid X \in B_t) = \lim_{t \uparrow 1} E(X \mid X \in C_t) = E(X) \text{ for } X \text{ assigned cdf } G.$$

(iv) $$\lim_{t \uparrow 1} E(P(c|d) \mid P \in A_t) = \lim_{t \uparrow 1} E(P(c|d) \mid P \in B_t) = \lim_{t \uparrow 1} E(P(c|d) \mid P \in C_t) = E(P(c|d)) \text{ for } P \text{ (or } X\text{) assigned cdf } G.$$

Proof. First replace separately in Theorem 6.1, h by $h_j$, j = 1, 2, 3. Also, replace there c by the
infinite left ray at x in real m-space, and S by $S_m$. This yields (i). Then, noting that
each cdf is a weighted sum of the corresponding exact threshold cdfs with range space t < s < 1
(as a typical example) squeezing down to unity itself as $t \uparrow 1$, and that all cdfs are
"well-behaved", etc., the top part of (ii) also holds. That the bottom part of (ii) holds is because, by
inspection:

$h_1(X) = 1$ iff $P(a|b)_J = 1$; $h_2(X) = 1$ iff $P_0(\&_0(a|b)_J) = 1$ iff $P(a|b)_J = 1$;

$h_3(X) = 1$ iff $P_0(\&_{AC}(a|b)_J) = 1$ iff $P_0((\&_{AC}(a|b)_J)') = 0$ iff $P(\vee(a'b)_J) = 0$
iff $P(a_j'b_j) = 0$, all j in J (but $P(b_j) > 0$) iff $P(a|b)_J = 1$.

Finally, (iii) and (iv) hold by simply applying the extended Helly-Bray moment theorem separately to
each of the three converging sequences of cdfs, for the identity function in X and for the
bilinear function in X representing $P(c|d)$, respectively, both being continuous bounded functions of X. (See,
e.g., [Loeve, 1963], Sections 11.3, 11.4.) ■

Theorem 6.2 ensures the mutual asymptotic equivalence of meanconc with respect to either the
original separate premise conditions, their PS conjunction forming one compound conditional
form in PSCEA, or their replacement by a significantly simpler single proper conditional via
$\&_{AC}$. However, the next results also reinforce this approximate equivalent asymptotic behavior
in another direction.

Theorem 6.3. A lower bound on $E(X \in A_t \mid X \in C_t)$.

Suppose Assumption I holds, as well as SCPL consistency (see Theorem 5.9). Suppose also that
P, i.e., X, is distributed over $S_m$ as Dirichlet dir($\tau$), where parameter vector
$\tau = (\tau_1,\ldots,\tau_m,\tau_{m+1}) \ge 1_{m+1}$.

Then, there exists an increasing function $g: [0,1] \to [0,1]$ with g(0) = 0, g(1) = 1, such that for all
t, 0 < t < 1,

$$g(t) < E_P\big(\mathrm{And}(P(a|b)_J > t) \mid P(\&_{AC}(a|b)_J) > t\big).$$

One can choose for g, without loss of generality (for at least all t sufficiently close to unity),

$$g(t) = \frac{1}{1 + \sqrt{(1-t)/t}} \cdot \Big[1 - \max_{j \in J}\big(F_j(\sqrt{(1-t)/t}\,)\big)\Big], \tag{6.2.8}$$

where $F_j$ is the cdf of the beta($\tau_{1,j}, \tau_{2,j}$) distribution;

$$\tau_{1,i} =_d \sum_{j \in I(\varphi_i)} (\tau_j); \qquad \varphi_i =_d a_i \& (\&(b \Rightarrow a)_J) \ne \emptyset; \qquad \tau_{2,i} =_d \sum_{j \in I(A - \varphi_i)} (\tau_j). \tag{6.2.9}$$


Proof: Since $\&_{AC}(a|b)_J = (A \mid A \vee A'B)$, $A'B = \&(b')_J$, with A defined as in eq.(6.2.9'),

$$A =_d (\&(b \Rightarrow a)_J) \& (\vee(b)_J), \tag{6.2.9'}$$

$P(\&_{AC}(a|b)_J) > t$ iff $P(A'B) < ((1-t)/t)\cdot P(A)$. Hence,

$$P(a_i|b_i) > P(\varphi_i)/(P(\varphi_i) + P(A'B)) > U_i / (U_i + ((1-t)/t)), \tag{6.2.10}$$

where, from the theory of Dirichlet distributions applied to random vector P (see, e.g., [Goodman
& Nguyen, 1999a] or [Wilks, 1963]),

$$U_i =_d P(\varphi_i)/P(A) \text{ is distributed as beta}(\tau_{1,i}, \tau_{2,i}), \text{ independent of } P(A) \text{ and } P(A'B). \tag{6.2.11}$$

Hence, denoting the pdf for beta($\tau_{1,i}, \tau_{2,i}$) as $h_i$, the required expectation here is

$$E\big(U_i/(U_i + ((1-t)/t))\big) = \int_{u=0}^{1} \big[(u/(u + ((1-t)/t)))\cdot h_i(u)\big]\,du. \tag{6.2.12}$$

Breaking up the range of integration in (6.2.12) into two parts, $[0, \sqrt{(1-t)/t}\,]$ and $[\sqrt{(1-t)/t}, 1]$,
it is clear that $u/(u + ((1-t)/t))$ over the first interval varies from 0 to $1/(1 + \sqrt{(1-t)/t}\,)$, while
over the second interval it varies between $1/(1 + \sqrt{(1-t)/t}\,)$ and t, so that on the second interval
the integrand factor is bounded below by $1/(1 + \sqrt{(1-t)/t}\,)$. Since $\tau \ge 1_{m+1}$, there
is a unique finite maximum $m_i$ given as

$$m_i = h_i\big((\tau_{1,i} - 1)/(\tau_{1,i} - 1 + \tau_{2,i} - 1)\big), \quad i \in J, \tag{6.2.13}$$

with limiting (uniform distribution) case interpreted as 1, when $\tau_{1,i} = \tau_{2,i} = 1$. (For justification of
eq.(6.2.13), see [Johnson & Kotz, 1972], vol. 2, Chapter 24, Section 3.)

Finally, putting eqs.(6.2.10)-(6.2.13) together,

$$\frac{1}{1 + \sqrt{(1-t)/t}} \cdot \big(1 - F_i(\sqrt{(1-t)/t}\,)\big) < E_P\big(\mathrm{And}(P(a|b)_J > t) \mid P(\&_{AC}(a|b)_J) > t\big), \quad \text{for } i \in J. \quad ■$$
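The bound in eq.(6.2.8) is straightforward to evaluate numerically once the beta parameters are known. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the beta cdf is approximated by a simple midpoint rule (reasonable only when both beta parameters are at least 1, so that the pdf is bounded), and the parameter pairs passed in stand for the $(\tau_{1,j}, \tau_{2,j})$ of eq.(6.2.9).

```python
import math

def beta_pdf(u, a, b):
    """Density of beta(a, b) at a point u strictly between 0 and 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))

def beta_cdf(x, a, b, steps=20_000):
    """Midpoint-rule approximation of the beta(a, b) cdf; intended for a, b >= 1."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    width = x / steps
    return sum(beta_pdf((k + 0.5) * width, a, b) for k in range(steps)) * width

def g_lower_bound(t, beta_params):
    """Evaluate g(t) = (1 / (1 + r)) * (1 - max_j F_j(r)), with r = sqrt((1 - t) / t)."""
    r = math.sqrt((1 - t) / t)
    worst_cdf = max(beta_cdf(r, a, b) for a, b in beta_params)
    return (1 - worst_cdf) / (1 + r)

# Hypothetical single premise with uniform beta(1, 1), where F(r) = r exactly.
print(g_lower_bound(0.99, [(1.0, 1.0)]))
```

In the uniform case the bound reduces to (1 - r)/(1 + r) with r = sqrt((1 - t)/t), which gives a quick check on the numerical integration, and g(1) evaluates to 1 as the theorem requires.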

Theorem  6.4.  Some  additional  bounds  connected  with  Theorem  6.3. 

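Under the same assumptions as in Theorem 6.3, for all 0 < t < 1, for all i in J,

(i) $\mathrm{Var}(P(a_i|b_i) \mid P(\&_{AC}(a|b)_J) > t) < (1-g(t))\cdot E(P(a_i|b_i) \mid P(\&_{AC}(a|b)_J) > t) < 1-g(t)$.

(ii) For all real $\varepsilon$, $0 < \varepsilon < 1$,

$$\mathrm{Prob}\big(|P(a_i|b_i) - E(P(a_i|b_i) \mid P(\&_{AC}(a|b)_J) > t)| < \varepsilon\big) > 1 - (1-g(t))/\varepsilon^2.$$

(iii) $\mathrm{Prob}\big(P(a_i|b_i) > g(t) - (1-g(t))^{1/3} \mid P(\&_{AC}(a|b)_J) > t\big) > 1 - (1-g(t))^{1/3}$,
for all t sufficiently close to 1.

(iv) $\mathrm{Prob}\big(\mathrm{And}_{i \in J}[P(a_i|b_i) > g(t) - (1-g(t))^{1/3}] \mid P(\&_{AC}(a|b)_J) > t\big) > 1 - (1-g(t))^{1/3}$,
for all t sufficiently close to 1.

Proof: Straightforward use of Theorem 6.3 together with Chebychev's inequality, where, letting
$Y_i =_d P(a_i|b_i)$, $Z_{i,t} =_d E(Y_i \mid C_t)$, $\sigma^2_{i,t} =_d \mathrm{Var}(P(a_i|b_i) \mid P(\&_{AC}(a|b)_J) > t)$,

$$\mathrm{Prob}(Y_i > g(t) - \varepsilon \mid C_t) > \mathrm{Prob}(Y_i > Z_{i,t} - \varepsilon \mid C_t) > \mathrm{Prob}(|Y_i - Z_{i,t}| < \varepsilon \mid C_t) > 1 - \sigma^2_{i,t}/\varepsilon^2 > 1 - (1-g(t))/\varepsilon^2$$

and then choosing $\varepsilon = (1-g(t))^{1/3}$, so that $(1-g(t))/\varepsilon^2 = (1-g(t))^{1/3}$. ■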

Definition. Given any probability space $(\Omega,\mathcal{B},P)$ and two collections $A = (A_t)_{0<t<1}$, $B = (B_t)_{0<t<1}$
with $A_t$, $B_t$ in $\mathcal{B}$ and each collection nesting down as t increases. Then, say that A and B are
intertwined iff there are functions $f, g: [0,1] \to [0,1]$, increasing and continuous with f(0) = g(0) =
0, f(1) = g(1) = 1, such that

$$B_{f(t)} \le A_t \le B_{g(t)}, \quad \text{for all } t,\ 0 < t < 1.$$

Hence, conversely,

$$A_{g^{-1}(t)} \le B_t \le A_{f^{-1}(t)}, \quad \text{for all } t,\ 0 < t < 1.$$

Lemma 6.1 Let $A = (A_t)_{0<t<1}$, $B = (B_t)_{0<t<1}$ be two intertwining collections as in the definition.
Let $A_1 =_d \&_{0<t<1}(A_t)$ and $B_1 =_d \&_{0<t<1}(B_t)$. Then, $A_1 = B_1$. Hence, if one is nonvacuous, so will be the
other.

Proof.

$$A_1 = \&_{0<t<1}(A_t) \le \&_{0<t<1}(B_{g(t)}) = \&_{0<t<1}(B_t) = B_1 \le \&_{0<t<1}(A_{f^{-1}(t)}) = \&_{0<t<1}(A_t) = A_1.$$

Theorem 6.5. Intertwining and almost intertwining of $A_t$'s, $B_t$'s, $C_t$'s as given in eqs.(6.1),
(6.3), (6.5).

Since the $A_t$'s, $B_t$'s, $C_t$'s are sets of probability vectors (the P's or the X's in our notation), we use
ordinary subset notation here. Under the same assumptions of Theorem 6.3, the following
relations hold for any t, 0 < t < 1,

$$A_t \subseteq B_{1-(1-t)\,\mathrm{card}(J)} \subseteq C_{1-(1-t)\,\mathrm{card}(J)}, \qquad A_{1-(1-t)/\mathrm{card}(J)} \subseteq B_t \subseteq C_t, \tag{6.2.14}$$

noting that the $C_t$'s in eq.(6.2.14) dominate all subset inclusions. As a partial converse, where g(t) is
provided from Theorem 6.3,

$$\mathrm{Prob}\big(X \in A_{g(t)-(1-g(t))^{1/3}} \mid X \in C_t\big) > 1 - (1-g(t))^{1/3},$$

i.e.,

$$\mathrm{Prob}\big(C_t \subseteq A_{g(t)-(1-g(t))^{1/3}}\big) > 1 - (1-g(t))^{1/3}. \tag{6.2.15}$$

Proof: Straightforward use of FHH inequalities (see eqs.(1.1.5) and (5.22)) and Theorem 6.4.


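Theorem 6.6. Closed-form expression for $E(P \mid P(a|b) = t)$

Make the Basic Assumption I, where now J = {1}, $A_{(t)} = \{P: P(a|b) = t\}$, and partition the vector X as
$X^T = (x_1,\ldots,x_m) = (X_{(1)}^T, X_{(2)}^T, X_{(3)}^T)$, $X_{(1)}^T = (x_1,\ldots,x_n)$, $X_{(2)}^T = (x_{n+1},\ldots,x_{n+p})$,
$X_{(3)}^T = (x_{n+p+1},\ldots,x_m)$, where, as usual, $x_j = P(\alpha_j)$, $x_{m+1} = 1 - \mathrm{sum}(X)$, etc. Suppose also that X is
distributed over $S_m$ as Dirichlet dir($\tau$), where $\tau = (\tau_1,\ldots,\tau_m,\tau_{m+1}) > 0_{m+1}$. Then, $E(P \mid P(a|b) = t) =
E(X \mid X \in A_{(t)})$ is in $A_{(t)}$ and

$$E(X \mid X \in A_{(t)})^T = E(P \mid P(a|b) = t)^T = \big(E(X_{(1)} \mid A_{(t)})^T,\ E(X_{(2)} \mid A_{(t)})^T,\ E(X_{(3)} \mid A_{(t)})^T\big),$$

where

$$E(X_{(1)} \mid A_{(t)})^T = t \cdot w_{(1)} \cdot (w_1,\ldots,w_n), \qquad E(X_{(2)} \mid A_{(t)})^T = (1-t) \cdot w_{(1)} \cdot (w_{n+1},\ldots,w_{n+p}),$$

$$E(X_{(3)} \mid A_{(t)})^T = w_{(2)} \cdot (w_{n+p+1},\ldots,w_m), \qquad E(x_{m+1} \mid A_{(t)}) = 1 - \mathrm{sum}(E(X \mid A_{(t)})) = 1 - w_{(1)} - w_{(2)} = w_{(3)},$$

where

$$w_{(1)} =_d \frac{\tau_{(1)} + \tau_{(2)}}{\tau_{(1)} + \tau_{(2)} + \tau_{(3)} + \tau_{m+1}}, \qquad w_{(2)} = \frac{\tau_{(3)}}{\tau_{(1)} + \tau_{(2)} + \tau_{(3)} + \tau_{m+1}}, \qquad w_{(3)} = \frac{\tau_{m+1}}{\tau_{(1)} + \tau_{(2)} + \tau_{(3)} + \tau_{m+1}},$$

$$\tau_{(1)} =_d \tau_1 + \ldots + \tau_n, \qquad \tau_{(2)} =_d \tau_{n+1} + \ldots + \tau_{n+p}, \qquad \tau_{(3)} =_d \tau_{n+p+1} + \ldots + \tau_m,$$

$$w_1 = (1/\tau_{(1)}) \cdot \tau_1, \ldots, w_n = (1/\tau_{(1)}) \cdot \tau_n, \qquad w_{n+1} = (1/\tau_{(2)}) \cdot \tau_{n+1}, \ldots, w_{n+p} = (1/\tau_{(2)}) \cdot \tau_{n+p},$$

$$w_{n+p+1} = (1/\tau_{(3)}) \cdot \tau_{n+p+1}, \ldots, w_m = (1/\tau_{(3)}) \cdot \tau_m.$$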
Proof. Since $A_{(t)}$ is a (closed) convex set, $E(X \mid X \in A_{(t)})$ is also in $A_{(t)}$. Next, consider three
cases for $x_i$, i = 1,...,m, letting $U = \mathrm{sum}(X_{(1)})$, $V = \mathrm{sum}(X_{(2)})$, $W = \mathrm{sum}(X_{(3)})$; in this notation,
$A_{(t)}$ holds iff U/(U + V) = t:

Case 1. i = 1,...,n.

$$E(x_i \mid A_{(t)}) = E\big((x_i/U)\cdot(U/(U+V))\cdot(U+V) \mid U/(U+V) = t\big).$$

But, from the theory of Dirichlet distributions (see, again, [Wilks, 1963] or [Goodman &
Nguyen, 1999a]), the random variables $(x_i/U)$, $U/(U+V)$, $(U+V)$ are all independent of each other
and have the beta distribution. Since the middle one is determined from the antecedent of the
expectation to be t, we need only specify the first, which is
beta($\tau_i, \tau_{(1)} - \tau_i$), and the third, which is beta($\tau_{(1)} + \tau_{(2)}, \tau_{(3)} + \tau_{m+1}$), so that

$$E(x_i \mid A_{(t)}) = (\tau_i/\tau_{(1)}) \cdot t \cdot \big((\tau_{(1)} + \tau_{(2)})/(\tau_{(1)} + \tau_{(2)} + \tau_{(3)} + \tau_{m+1})\big).$$

Case 2. i = n+1,...,n+p.

$$E(x_i \mid A_{(t)}) = E\big((x_i/V)\cdot(V/(U+V))\cdot(U+V) \mid U/(U+V) = t\big) = E\big((x_i/V)\cdot(1 - (U/(U+V)))\cdot(U+V) \mid U/(U+V) = t\big),$$

noting the mutual independence and beta distributions of $(x_i/V)$, $U/(U+V)$, $(U+V)$, with the
middle term determined from the antecedent of the expectation as 1-t. Again, as in Case 1, the
beta distribution expectations are readily obtained.

Case 3. i = n+p+1,...,m.

$$E(x_i \mid A_{(t)}) = E\big((x_i/W)\cdot W \mid U/(U+V) = t\big),$$

noting the mutual independence and computable beta distributions of $(x_i/W)$, W, $U/(U+V)$. ■
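The closed form of Theorem 6.6 translates directly into a few lines of code. The sketch below is illustrative only: the function name `conditional_mean` and the example parameters are hypothetical, atoms are indexed from 0, and the first n entries of `tau` are taken to be block (1), the next p block (2), the remaining entries up to m block (3), and the last entry the designated atom m+1.

```python
def conditional_mean(tau, n, p, t):
    """Return E(X | P(a|b) = t) together with E(x_{m+1} | P(a|b) = t) as one list.

    tau holds the m+1 positive Dirichlet parameters, ordered as blocks (1), (2), (3)
    followed by the parameter of the designated last atom.
    """
    m = len(tau) - 1
    tau_1 = sum(tau[:n])
    tau_2 = sum(tau[n:n + p])
    tau_3 = sum(tau[n + p:m])
    total = tau_1 + tau_2 + tau_3 + tau[m]
    w_1 = (tau_1 + tau_2) / total     # expected mass of the antecedent b
    w_2 = tau_3 / total               # expected mass of block (3)
    w_3 = tau[m] / total              # expected mass of the last atom
    mean = [t * w_1 * tau[i] / tau_1 for i in range(n)]
    mean += [(1 - t) * w_1 * tau[i] / tau_2 for i in range(n, n + p)]
    mean += [w_2 * tau[i] / tau_3 for i in range(n + p, m)]
    mean.append(w_3)
    return mean

# Hypothetical uniform prior over five atoms, one atom in each of blocks (1) and (2).
p_sharp = conditional_mean([1.0, 1.0, 1.0, 1.0, 1.0], n=1, p=1, t=0.9)
print(p_sharp)
```

Two quick checks follow from the theorem itself: the returned components sum to one, and the ratio of the block (1) mass to the combined mass of blocks (1) and (2) equals t, so the result lies in $A_{(t)}$.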


Theorem 6.7. Plug-in evaluation $P^{\#}(c|d)$ relative to $P^{\#} =_d E(P \mid P(a|b) = t)$.

Under the same hypothesis as Theorem 6.6, noting again that $P^{\#} = E(P \mid P(a|b) = t)$ is a legitimate
probability vector in $A_{(t)}$ because of the closed convexity of $A_{(t)} = \{P: P(a|b) = t\}$, and using the
multivariable notation $\Sigma(w)_{I(c)}$ for $\sum_{j \in I(c)} (w_j)$, where $I(c) \subseteq \{1,\ldots,m+1\}$ is the index set for c with
respect to the set of atoms $\mathcal{A}$,

$$P^{\#}(c|d) = N^{\#} / D^{\#};$$

$$N^{\#} = P^{\#}(c) = P^{\#}(ac) + P^{\#}(a'bc) + P^{\#}(b'c) = t \cdot w_{(1)} \cdot \Sigma(w)_{I(ac)} + (1-t) \cdot w_{(1)} \cdot \Sigma(w)_{I(a'bc)} + w_{(2)} \cdot \Sigma(w)_{I(b'c)},$$

$$D^{\#} = P^{\#}(d) = P^{\#}(c) + P^{\#}(c'd) = N^{\#} + P^{\#}(ac'd) + P^{\#}(a'bc'd) + P^{\#}(b'c'd)$$

$$= N^{\#} + t \cdot w_{(1)} \cdot \Sigma(w)_{I(ac'd)} + (1-t) \cdot w_{(1)} \cdot \Sigma(w)_{I(a'bc'd)} + w_{(2)} \cdot \Sigma(w)_{I(b'c'd)}.$$

Proof: Straightforward substitution of values from Theorem 6.6 into $P^{\#}(c|d)$, using the
decomposition $P^{\#}(c|d) = P^{\#}(c) / (P^{\#}(c) + P^{\#}(c'd))$, $P^{\#}(c) = P^{\#}(ac) + P^{\#}(a'bc) + P^{\#}(b'c)$,
$P^{\#}(c'd) = P^{\#}(ac'd) + P^{\#}(a'bc'd) + P^{\#}(b'c'd)$. ■
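A minimal sketch of the plug-in evaluation follows, written directly from the weighted-sum form of $N^{\#}$ and $D^{\#}$. The function name, the 0-based atom indexing, and the example are hypothetical. As in the displayed formulas, events are assumed to involve only the first m atoms, and c is assumed to be contained in d so that $P^{\#}(d) = P^{\#}(c) + P^{\#}(c'd)$.

```python
def plug_in_conditional(tau, n, p, t, c_atoms, d_atoms):
    """Evaluate P#(c|d) = N# / D# for P# = E(P | P(a|b) = t) under a Dirichlet(tau) prior.

    c_atoms and d_atoms are sets of 0-based atom indices in 0..m-1, with c contained in d.
    Atoms 0..n-1 form block (1), n..n+p-1 block (2), and n+p..m-1 block (3).
    """
    m = len(tau) - 1
    blocks = [range(0, n), range(n, n + p), range(n + p, m)]
    block_tau = [sum(tau[i] for i in block) for block in blocks]
    total = sum(tau)
    w_1 = (block_tau[0] + block_tau[1]) / total
    w_2 = block_tau[2] / total
    # Multipliers t*w_(1), (1-t)*w_(1), w_(2) applied to the within-block weights w_j.
    scale = [t * w_1, (1 - t) * w_1, w_2]

    def mass(event):
        return sum(scale[k] * tau[i] / block_tau[k]
                   for k, block in enumerate(blocks)
                   for i in block if i in event)

    return mass(c_atoms) / mass(d_atoms)

# Hypothetical uniform prior over five atoms, one atom in each of blocks (1) and (2).
uniform = [1.0] * 5
print(plug_in_conditional(uniform, 1, 1, 0.9, {0}, {0, 1}))
print(plug_in_conditional(uniform, 1, 1, 0.9, {0, 2}, {0, 1, 2, 3}))
```

The first call uses the premise itself as the conclusion and so should return the threshold t; the second mixes atoms from blocks (1) and (3) in the numerator and should return (0.36 + 0.2)/(0.36 + 0.04 + 0.4) = 0.7 up to rounding.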

7.  Summary  of  Algebraic  Characterization  of  Asymptotic  Form  of  Expected  Surety 
Deduction  for  Common  Threshold  Case 

Apropos to the discussion in Section 6.2 concerning the difficulty in exactly determining
meanconc functions for the fixed nonlimiting threshold case, quite a different story holds for the
evaluation of these functions when the thresholds are allowed to approach unity. [Bamber,
2000], under a uniform second order prior distribution assumption, in the same spirit as this
paper, has derived an algebraic characterization for deciding when $E(P(c|d) \mid P \in A_t)$ approaches
unity as threshold t approaches unity. This algebraic procedure (involving, in Bamber's
terminology, "rarity functions") is equivalent to (also algebraic) deduction-validating procedures
in Pearl's [1990] System Z and in Lehmann & Magidor's [1992] rational closure. (In fact all of
the above work for common threshold t has been extended to non-identical thresholds
approaching unity via some power of one another; see, e.g., [Bamber, 2000] for further details.)
We describe below, in outline form, an equivalent process, which also allows us to obtain the full
asymptotic distributional form associated with meanconc. This process is a sequential one:
in step one, the given potential EPL conclusion $(c|d)$ is first tested as to which combination of
$c\&(\&(b \Rightarrow a)_J)$ and/or $c'd\&(\&(b \Rightarrow a)_J)$ is null or not. Only if the first level indeterminate
case holds, i.e., $c\&(\&(b \Rightarrow a)_J) = c'd\&(\&(b \Rightarrow a)_J) = \emptyset$, does one proceed to refine this further by
replacing J in step one by $K_1 = \{i \in J: a_i \& (\&(b \Rightarrow a)_J) = \emptyset\}$ (involving the SCPL criterion for
the $K_1$ index set; see again Theorem 5.9). Then, the procedure is repeated, continuing on to the
next level only if the indeterminate case $c\&(\&(b \Rightarrow a)_{K_1}) = c'd\&(\&(b \Rightarrow a)_{K_1}) = \emptyset$ holds.
Re-examining the above a little more closely and making the usual general Dirichlet
second order prior probability assumption for P (or X), one can readily obtain the full asymptotic
limiting distribution of $(P(c|d) \mid A_t)$ as t approaches unity: Returning to the first stage, consider

Step 1, Case 1. $c\&(\&(b \Rightarrow a)_J) \ne \emptyset$ and $c'd\&(\&(b \Rightarrow a)_J) \ne \emptyset$. Decompose c and c'd
relative to $P(c|d)$ into their intersections with $\&(b \Rightarrow a)_J$ and its complement $\vee(a'b)_J$, and note
that the constraints in $A_t$ show that for t close to unity any probability assigned to anything
intersecting some $a_j'b_j$ will be negligible (unless all probabilities are so). This shows that a
typical $P(c|d)$ is essentially the same as

$$P(c\&(\&(b \Rightarrow a)_J)) \,/\, [P(c\&(\&(b \Rightarrow a)_J)) + P(c'd\&(\&(b \Rightarrow a)_J))]$$

which has no further constraints upon it as t approaches unity and, from basic Dirichlet family
properties, as a random variable has a beta distribution with nontrivial computable parameters.
Hence, for this case, neither unity nor zero mass-point asymptotic distributions occur
in general.

Step 1, Case 2. $c\&(\&(b \Rightarrow a)_J) = \emptyset$, i.e., $c \le \vee(a'b)_J$, and $c'd\&(\&(b \Rightarrow a)_J) \ne \emptyset$.

By reasoning similar to Case 1, the asymptotic form of a typical $P(c|d)$ must be

$$P(c\&(\vee(a'b)_J)) \,/\, [P(c\&(\vee(a'b)_J)) + P(c'd\&(\vee(a'b)_J))],$$

where clearly $P(c\&(\vee(a'b)_J))$ approaches zero, but $P(c'd\&(\vee(a'b)_J))$ remains independent of t.
Hence, this case stochastically leads to a zero mass-point for the limiting form of $(P(c|d) \mid A_t)$.

Step 1, Case 3. $c\&(\&(b \Rightarrow a)_J) \ne \emptyset$ and $c'd\&(\&(b \Rightarrow a)_J) = \emptyset$, i.e., $c'd \le \vee(a'b)_J$.

The asymptotic limiting form here for $(P(c|d) \mid A_t)$ is

$$P(c\&(\&(b \Rightarrow a)_J)) \,/\, P(c\&(\&(b \Rightarrow a)_J)),$$

i.e., a unity mass-point distribution.

Step 1, Case 4. $c\&(\&(b \Rightarrow a)_J) = c'd\&(\&(b \Rightarrow a)_J) = \emptyset$.

Replacing J above in the four cases of Step 1 by $K_1$ leads to the following possibilities:

Step 2, Case 1. $c\&(\&(b \Rightarrow a)_{K_1}) \ne \emptyset$ and $c'd\&(\&(b \Rightarrow a)_{K_1}) \ne \emptyset$,
yet

$$c\&(\&(b \Rightarrow a)_J) = \emptyset, \text{ and } c'd\&(\&(b \Rightarrow a)_J) = \emptyset.$$

Then, analogous to the reasoning for Step 1, Case 1, while both numerator and denominator of
$P(c|d)$ go to zero as t approaches unity, the rate of convergence to zero for $P(c\&(\&(b \Rightarrow a)_{K_1}))$ and
$P(c'd\&(\&(b \Rightarrow a)_{K_1}))$ remains an order of magnitude less than that of the complement terms. Whence, the
same formal situation once more holds in that the dominating expression for $P(c|d)$ is

$$P(c\&(\&(b \Rightarrow a)_{K_1})) \,/\, [P(c\&(\&(b \Rightarrow a)_{K_1})) + P(c'd\&(\&(b \Rightarrow a)_{K_1}))]$$

which is beta distributed asymptotically, etc.

One then continues, just as in each Case of Step 1, establishing the lower rate of zero convergent
expressions. If Case 4 of Step 2 holds, then we must refine even further, replacing now $K_1$ by $K_2
= \{i \in K_1: a_i \& (\&(b \Rightarrow a)_{K_1}) = \emptyset\}$ and continuing the process, which is guaranteed to end in a
finite number of steps.
8. Closing Remarks and Some Research Issues.

Analogous to Section 5, where various relations were established between the weak and strong
forms of HPL and CPL deduction, one can determine various relations between EPL deduction
and HPL and CPL deduction. At the outset, as in Theorem 5.12 for CPL and HPL, EPL
deduction reduces to classical logic deduction relative to premise and potential conclusion
consisting of only unconditional events. For simplicity, we have tacitly only considered the
strong form of EPL, where all conditional probabilities involved are well-defined, and will
continue to consider that case here for HPL and CPL, as well, when possible. First, it is clear
from the very definition of minconc and meanconc (see Section 3) that we always have

$$\mathrm{minconc}((a|b)_J;(c|d))(t) < \mathrm{meanconc}((a|b)_J;(c|d))(t), \quad \text{for } 0 < t < 1. \tag{8.1}$$

Eq.(8.1) immediately implies that

$$(a|b)_J \le_{HPL} (c|d) \text{ implies } (a|b)_J \le_{EPL} (c|d), \tag{8.2}$$

where the converse does not hold in general as seen in Table 1, with transitivity, contraposition,
and strengthening being examples of EPL deductions but not HPL ones. On the other hand,
Theorem 6.2(iv) shows that

$$(a|b)_J \le_{EPL} (c|d) \text{ implies } (a|b)_J \le_{CPL} (c|d). \tag{8.3}$$

As in eq.(8.2), the implication in eq.(8.3) is not reversible in general. A case where CPL (weak)
deduction is valid, but EPL is not, is provided by the Nixon diamond scheme (number 23 in
Table 1). (Bamber [2000] has also considered relations between EPL, HPL, and CPL deduction
validity.)

We illustrate here the nonmonotonicity of EPL vs. the monotonicity of HPL and CPL, where
again, it should be noted that, by their very definitions, once a valid deduction holds in the sense
of HPL or CPL, with respect to $P(c|d)$ for given $(a|b)_J$, it must hold in the same sense for any
given increased premise set $(a|b)_{J \cup K}$. Consider the Penguin Triangle Deduction Scheme
(number 17 of Table 1), where in our general deduction notation $(c|d)$ is $(a'b|c)$, J = {1,2,3,4},
with $(a_1|b_1)$ being $(a|b)$, $(a_2|b_2)$ being $(b|c)$, $(a_3|b_3)$ being $(d|c)$, and $(a_4|b_4)$ being $(a'b|d)$, with, as in
all of Table 1, a, b, c, d not otherwise constrained (as opposed to Assumption I). First, form
$\&(b \Rightarrow a)_J = b'c'd' \vee abc'd'$. Then, as in the procedure in Section 7, test for which case holds in
Step 1 for $\&(b \Rightarrow a)_J$ conjoining here, nonvacuously or not, a'c, ac. Thus, only Step 1, Case 4
holds, where both intersections are null. Then, form $K_1 = \{i \in J: a_i \& (b'c'd' \vee abc'd') = \emptyset\} =
\{2,3,4\}$. In turn, now form $\&(b \Rightarrow a)_{K_1} = c'd' \vee a'bd$ and retest for which possible
nonvacuous-vacuous combination with a'bc, abc holds. This yields only Step 2, Case 3, since
$a'bc\&(c'd' \vee a'bd) = a'bcd \ne \emptyset$, $abc\&(c'd' \vee a'bd) = \emptyset$. Hence, in the limit as t approaches 1, $(P(c|d) \mid A_t)$
approaches the mass-point 1, i.e., $E(P(c|d) \mid A_t)$ approaches 1 and EPL validity holds, where $A_t$
corresponds to J = {1,2,3,4} here.
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On  the  other  hand,  for  the  same  potential  conclusion  (a'b  Ic),  but  smaller  premise  set  than  above, 
consisting  only  of  that  part  of  At  corresponding  to  J  =  {1,2},  we  know  from  the  transitivity 
property  (number  13,  Table  1),  that  (ale)  is  EPL-deduced  from  {(alb),  (blc) }  and  hence 
limit(  E(P(alc)  I  P(alb),  P(cld)  >  t))  =  1  and  thus  limit(  E(P(aT>  Ic)  I  P(alb),  P(cld)  >  t))  =  0, 

tTi  tti 

showing  invalidity  for  the  same  conclusion  with  the  smaller  premise  class. 
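
The Boolean computations in the Penguin Triangle example above can be checked mechanically. The following is a minimal sketch in Python, not the procedure of Section 7 itself: it represents each event as the set of truth assignments to a, b, c, d on which it holds and recomputes the two conjunctions of material conditionals and their intersections with $a'bc$ and $abc$. The helper names (`event`, `material`, `conj_material`) are illustrative assumptions, and $K_1 = \{2,3,4\}$ is taken from the text rather than recomputed.

```python
from itertools import product

# Each atom is a truth assignment (a, b, c, d); an event is a set of atoms.
ATOMS = frozenset(product([False, True], repeat=4))

def event(pred):
    """Event on which the predicate pred(a, b, c, d) is true."""
    return frozenset(w for w in ATOMS if pred(*w))

def material(antecedent, consequent):
    """Material conditional: antecedent' v consequent."""
    return (ATOMS - antecedent) | consequent

A = event(lambda a, b, c, d: a)
B = event(lambda a, b, c, d: b)
C = event(lambda a, b, c, d: c)
D = event(lambda a, b, c, d: d)
NOT_A_AND_B = event(lambda a, b, c, d: (not a) and b)

# Premises as (consequent, antecedent), indexed 1..4 as in the text:
# (a|b), (b|c), (d|c), (a'b|d).
PREMISES = {1: (A, B), 2: (B, C), 3: (D, C), 4: (NOT_A_AND_B, D)}

def conj_material(index_set):
    """Conjunction of the material conditionals b_j => a_j over index_set."""
    result = ATOMS
    for j in index_set:
        consequent, antecedent = PREMISES[j]
        result = result & material(antecedent, consequent)
    return result

conj_J = conj_material({1, 2, 3, 4})
conj_K1 = conj_material({2, 3, 4})  # K_1 as stated in the text

# Closed forms quoted in the text.
form_J = event(lambda a, b, c, d: (not b and not c and not d)
               or (a and b and not c and not d))
form_K1 = event(lambda a, b, c, d: (not c and not d) or (not a and b and d))

A_PRIME_BC = event(lambda a, b, c, d: (not a) and b and c)
ABC = event(lambda a, b, c, d: a and b and c)
A_PRIME_BCD = event(lambda a, b, c, d: (not a) and b and c and d)

checks = {
    "conj_J equals b'c'd' v abc'd'": conj_J == form_J,
    "conj_K1 equals c'd' v a'bd": conj_K1 == form_K1,
    "a'bc & conj_K1 equals a'bcd": (A_PRIME_BC & conj_K1) == A_PRIME_BCD,
    "abc & conj_K1 is empty": len(ABC & conj_K1) == 0,
}
print(checks)
```

Enumerating atoms is only practical for a handful of generating events, but it is enough to confirm the closed forms used in this four-variable example.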

Finally, among the key open problems, mention should be made of that of determining bounds
on the differences between the actual meanconc functions and their asymptotic single-rule
plug-in replacements. In a similar direction, upper bounds are sought in the general case, for both
fixed thresholds and unity limiting ones, for the differences between meanconc and alternative
deduction functions, including minconc and maxent. While some of the discrepancies for certain
specific cases among the three approaches (using meanconc, minconc, maxent) are pointed out in
Tables 1 and 2, a more general study is needed to consider tradeoffs between desirable deduction
properties and computational complexity. It is also of interest to consider extensions of EPL to a
linguistic setting, as a counterpart to the numerous fuzzy logic approaches to reasoning. An
outline for the beginning of such an extension is provided in Section 3.3 of [Goodman &
Nguyen, 1999b].

Appendix A. Proof of Theorem 5.6.

Lemma A.1. Make Basic Assumption I, assume WHPL consistency, and for some K, $\emptyset \ne K \subseteq J$,
assume that

$\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) = \emptyset$. (A.1)

Then, for any real $\delta$, $0 < \delta < 1/(2\,\mathrm{card}(J))$, and any P (probability measure over B), the following
statements are equivalent:

(i) For each j in J, either $P(a_j|b_j) \ge 1-\delta$ or $P(b_j) = 0$.

(ii) For all j in J, $P(b_j) = 0$.

Proof: Obviously, (ii) implies (i). Suppose (i) holds for some P such that there is a set L, $\emptyset \ne L \subset K$
(proper), such that $P(a_j|b_j) \ge 1-\delta$ for all j in L and $P(b_j) = 0$ for all j in $K \setminus L$ ($\ne \emptyset$). By the
FHH lower bound (eq.(5.22)), we must have, for the PSCEA counterpart $P_0$ of P,

$P_0(\&_0(a|b)_L) \ge \max(\Sigma(P(a|b)_L) - (\mathrm{card}(L)-1),\, 0) \ge \max(\mathrm{card}(L)\cdot(1-\delta) - (\mathrm{card}(L)-1),\, 0)$

$= \max(1 - \delta\cdot\mathrm{card}(L),\, 0) = 1 - \delta\cdot\mathrm{card}(L) > 1/2 > 0$. (A.2)
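
The arithmetic behind eq.(A.2) can be illustrated numerically. The sketch below, in Python with exact rational arithmetic, evaluates the lower bound $\max(\Sigma p_j - (n-1), 0)$ in the boundary case where every conditional probability equals $1-\delta$, and confirms that it stays above 1/2 whenever $\delta < 1/(2\,\mathrm{card}(J))$. The function name `conjunction_lower_bound` and the values card(J) = 4, $\delta$ = 1/10 are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def conjunction_lower_bound(probs):
    """Lower bound max(sum(p) - (n - 1), 0) quoted in eq.(A.2)."""
    n = len(probs)
    return max(sum(probs) - (n - 1), Fraction(0))

card_J = 4                      # illustrative size of the index set J
delta = Fraction(1, 10)         # satisfies 0 < delta < 1/(2*card_J) = 1/8

bounds = {}
for card_L in range(1, card_J + 1):
    # Boundary case: each P(a_j|b_j) equals exactly 1 - delta.
    probs = [1 - delta] * card_L
    bounds[card_L] = conjunction_lower_bound(probs)

for card_L, bound in bounds.items():
    # Each bound equals 1 - delta*card(L) and exceeds 1/2.
    print(card_L, bound, bound == 1 - delta * card_L, bound > Fraction(1, 2))
```

Larger conditional probabilities can only raise the sum, so the boundary case is the least favorable one for the bound.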

Then, eq.(A.2) combined with eq.(4.26) and the monotonicity property of probability shows that

$P_0(\&_{AC}(a|b)_L) \ge P_0(\&_0(a|b)_L) > 0$. (A.3)

From the structure of $\&_{AC}$ in eqs.(4.23) and (4.27) combined with eq.(A.3),

$P(\&(b \Rightarrow a)_L \,\&\, (\vee(b)_L)) > 0$. (A.4)

But the assumption $P(b_j) = 0$, j in $K \setminus L$, implies $P(\&(b')_{K \setminus L}) = 1$, whence, using eq.(A.4),

$P[\&(b \Rightarrow a)_L \,\&\, (\vee(b)_L) \,\&\, (\&(b')_{K \setminus L})] > 0$. (A.5)

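Now, from the definitions in eqs.(4.16)-(4.18),

$\&(b \Rightarrow a)_L \,\&\, (\vee(b)_L) \,\&\, (\&(b')_{K \setminus L}) = \bigvee_{\emptyset \ne C \subseteq L} (\gamma(a,b;C,L) \,\&\, \&(b')_{K \setminus L})$

$= \bigvee_{\emptyset \ne C \subseteq L} \gamma(a,b;C,K) \le \bigvee_{\emptyset \ne C \subseteq K} \gamma(a,b;C,K) = \&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)$. (A.6)

Thus, applying the monotonicity of probability to eq.(A.6) and combining with eq.(A.5) shows

$P(\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)) > 0$,

contradicting the assumption that $\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) = \emptyset$. Hence, no such L can exist.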

The only remaining possibilities are L = K or L = $\emptyset$. In the case of L = K, application of
the FHH lower bound, together with the ordering between $\&_{AC}$ and $\&_0$ in eq.(4.26), again shows

$P_0(\&_{AC}(a|b)_K) \ge P_0(\&_0(a|b)_K) > 0$,

which, analogous to eqs.(A.3), (A.4), implies

$P(\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)) > 0$,

once more leading to a contradiction of the assumption. Hence, the only possibility is for (ii)
to hold. ■

Lemma A.2. Under the same assumptions as Lemma A.1 and for the K satisfying eq.(A.1),
assume

$\&(b \Rightarrow a)_K \le d \Rightarrow c$. (A.7)

Then, for any real $\delta$, $0 < \delta < 1/(2\,\mathrm{card}(J))$, and any P (probability measure over B):

If [for each j in J, either $P(a_j|b_j) \ge 1-\delta$ or $P(b_j) = 0$], then [$P(d \Rightarrow c) = 1$]. (A.8)

Proof: The “if” part of eq.(A.8) certainly implies that the condition holds for K as a subset of J.
Then, applying Lemma A.1, we must have $P(b_j) = 0$, all j in K, i.e.,

$P(\&(b')_K) = 1$. (A.9)

But, combining eq.(A.7) with a standard property of the material conditional,

$\&(b')_K \le \&(b \Rightarrow a)_K \le d \Rightarrow c$. (A.10)

Then, applying P throughout eq.(A.10) and using eq.(A.9) shows the desired result. ■
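
The "standard property of the material conditional" invoked in eq.(A.10) is that whenever every antecedent $b_j$ is false, every material conditional $b_j \Rightarrow a_j$ is true. A brute-force truth-table check, sketched in Python with hypothetical helper names, makes this concrete for small index sets:

```python
from itertools import product

def material(b, a):
    """Truth value of the material conditional b => a."""
    return (not b) or a

def antecedents_false_imply_conditionals(n):
    """Check &(b')_K <= &(b => a)_K over all truth assignments, card(K) = n."""
    for values in product([False, True], repeat=2 * n):
        a_vals, b_vals = values[:n], values[n:]
        all_antecedents_false = not any(b_vals)
        all_conditionals_true = all(
            material(b, a) for a, b in zip(a_vals, b_vals)
        )
        if all_antecedents_false and not all_conditionals_true:
            return False  # would be a counterexample to the ordering
    return True

results = [antecedents_false_imply_conditionals(n) for n in range(1, 5)]
print(results)
```

Monotonicity of P then carries the ordering over to probabilities, which is the step used with eq.(A.9).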

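Lemma A.3. Make Basic Assumption I, assume WHPL consistency, and for some K, $\emptyset \ne K \subseteq J$,
assume that

$\emptyset \ne \&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) \le c$ (A.11)

and

$\&(b \Rightarrow a)_K \le d \Rightarrow c$. (A.12)

Then, for that K,

(i) $\emptyset_0 \ne \&_{AC}(a|b)_K \le_0 (c|d)$.

(ii) For any real $\delta$, $0 < \delta < 1/(2\,\mathrm{card}(J))$, and any P such that for all j in J, either $P(a_j|b_j) \ge 1-\delta$ or
$P(b_j) = 0$, we have [$P(c|d) \ge 1 - \delta\cdot\mathrm{card}(J)$].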

Proof: First, the left-hand side of eq.(A.11) shows that $\&_{AC}(a|b)_K$ must be a proper conditional
event. (Again, see the discussion following eq.(4.23).) Using the definition of $\&_{AC}$ and
comparing the remaining part of eq.(A.11) and eq.(A.12) with the basic ordering criterion of
PSCEA between proper conditional events given in eq.(4.19) shows the validity of (i).

Next, let $K_P =_d \{j \in K : P(b_j) = 0\}$. Thus, by hypothesis, $K \setminus K_P = \{j \in K : P(a_j|b_j) \ge 1-\delta\}$,

$P(\vee(b)_K) = P(\vee(b)_{K \setminus K_P})$, $P(\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)) = P(\&(b \Rightarrow a)_{K \setminus K_P} \,\&\, (\vee(b)_{K \setminus K_P}))$,

implying

$P(\&_{AC}(a|b)_K) = P(\&_{AC}(a|b)_{K \setminus K_P})$. (A.13)


Case 1. $K_P = K$. Then, by hypothesis (A.11),

$P(c) \ge P(\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)) = P(\&(b')_K) = 1$, implying $P(c|d) = 1$.

Case 2. $\emptyset \subseteq K_P \subset K$ (proper).

Subcase 1. $P(\&(b \Rightarrow a)_{K \setminus K_P} \,\&\, (\vee(b)_{K \setminus K_P})) = 0$. But, by Lemma A.1, this condition implies for all j
in $K \setminus K_P$ that $P(b_j) = 0$, contradicting the very meaning of $K \setminus K_P$.

Subcase 2. $P(\&(b \Rightarrow a)_{K \setminus K_P} \,\&\, (\vee(b)_{K \setminus K_P})) > 0$. Now, again using the FHH lower bound in
eq.(5.22), as in eq.(A.2), replacing L there by $K \setminus K_P$, combining with the monotonicity property
of $P_0$ applied to result (i) and the order relation between $\&_0$ and $\&_{AC}$, and using eq.(A.13),

$P(c|d) \ge P_0(\&_{AC}(a|b)_K) = P_0(\&_{AC}(a|b)_{K \setminus K_P}) \ge P_0(\&_0(a|b)_{K \setminus K_P}) \ge 1 - \delta\cdot\mathrm{card}(K \setminus K_P) \ge 1 - \delta\cdot\mathrm{card}(J)$,

the desired result for (ii). ■

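Lemma A.4. Make Basic Assumption I and assume $(a|b)_J$ is WHPL consistent. Then,

Assumption Q implies [$(a|b)_J \le_{WHPL} (c|d)$],

where

Assumption Q:

(there is some K, $\emptyset \ne K \subseteq J$)([$\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) \le c$] and [$\&(b \Rightarrow a)_K \le d \Rightarrow c$]).

Proof: Break up Assumption Q into two parts,

$Q_1$: (there is some K, $\emptyset \ne K \subseteq J$)([$\&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) = \emptyset$] and [$\&(b \Rightarrow a)_K \le d \Rightarrow c$]),

$Q_2$: (there is some K, $\emptyset \ne K \subseteq J$)([$\emptyset \ne \&(b \Rightarrow a)_K \,\&\, (\vee(b)_K) \le c$] and [$\&(b \Rightarrow a)_K \le d \Rightarrow c$]),

and apply Lemma A.2 to $Q_1$ and Lemma A.3 to $Q_2$. ■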

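Lemma A.5. Make Basic Assumption I and assume $(a|b)_J$ is WHPL consistent. Then,

not(Q) implies not[$(a|b)_J \le_{WHPL} (c|d)$].

Proof: First note that

not(Q) iff (for all K, $\emptyset \ne K \subseteq J$)($I_{K,1}$ or $I_{K,2}$ or $I_{K,3}$), (A.14)

where

$I_{K,1} =_d (\tau_K \,\&\, c'd \ne \emptyset)$; $I_{K,2} =_d (\&(b')_K \,\&\, c'd \ne \emptyset)$; $I_{K,3} =_d (\tau_K \,\&\, d' \ne \emptyset)$; (A.15)

$\tau_K =_d \&(b \Rightarrow a)_K \,\&\, (\vee(b)_K)$, $\emptyset \ne K \subseteq J$. (A.16)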




Next, proceed to construct mutually disjoint “blocks”, analogous to the procedure in the proof of
Theorem 5.1 ((ii) implies (iv)), but slightly modified, beginning with index set J. Thus, we
obtain a nonvacuous exhaustive disjoint partitioning $\{K_1,...,K_M\}$ of J, for some positive integer
M, with the same notation as in eq.(5.15), such that there is a collection of mutually disjoint
nonvacuous events, which are the blocks, given as

$\gamma(a,b;K_j,J \setminus K(j)) \,\&\, \eta_j$, j = 1,...,M-1, (A.17)

and where for the first time, at step M, by definition, either

Case 1. $\gamma(a,b;K_M,J \setminus K(M)) \,\&\, \eta_M = \&(a)_{K_M} \,\&\, \eta_M$ is also nonvacuous and mutually disjoint with
respect to events in (A.17),

where

$\eta_j \in \{c'd, d'\}$, j = 1,...,M; (A.18)

or

Case 2. $\gamma(a,b;\emptyset,J \setminus K(M-1)) = \&(b')_{J \setminus K(M-1)}$ is also nonvacuous and mutually
disjoint with respect to events in (A.17), where all $\gamma(a,b;L,K)$ are defined as usual as in eq.(4.16)
with K replaced by L and J by K, etc.

The procedure ends when either Case 1 or Case 2 first occurs. (In the construction in the proof
of Theorem 5.1, the conjunctive factors $\eta_j$ were missing and the procedure ended when Case 1
first occurred, at the Mth step.) Next, analogous to the construction of P in the proof of
Theorem 5.1 ((iv) implies (i)), assign

$P[\gamma(a,b;K_j,J \setminus K(j)) \,\&\, \eta_j] =_d \delta^{j-1} - \delta^{j}$, for j = 1,...,M-1;

$P[\gamma(a,b;K_M,J \setminus K(M-1)) \,\&\, \eta_M] =_d \delta^{M-1}$, if Case 1 holds;

$P[\gamma(a,b;\emptyset,J \setminus K(M-1))] =_d \delta^{M-1}$, if Case 2 holds. (A.19)
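
The assignment in eq.(A.19) is a telescoping one. As a hedged illustration, the Python sketch below assumes that block j (for j < M) receives mass $\delta^{j-1} - \delta^{j}$ and that the final block receives $\delta^{M-1}$; this reading is an assumption, chosen to agree with the later statement that P is unity over the disjunction of all of the blocks. The function name `block_masses` and the values $\delta$ = 1/10, M = 4 are illustrative only.

```python
from fractions import Fraction

def block_masses(delta, M):
    """Masses of blocks 1..M under the assumed reading of eq.(A.19)."""
    # Blocks 1..M-1 get delta^(j-1) - delta^j (assumed form).
    masses = [delta ** (j - 1) - delta ** j for j in range(1, M)]
    # The final block gets the remaining mass delta^(M-1) (assumed form).
    masses.append(delta ** (M - 1))
    return masses

delta = Fraction(1, 10)
masses = block_masses(delta, 4)

# The masses telescope to total probability one.
total = sum(masses)

# Each non-final block carries the fraction 1 - delta of the mass of
# itself together with all later blocks.
tail_ratios = [masses[j] / sum(masses[j:]) for j in range(len(masses) - 1)]

print(masses, total, tail_ratios)
```

The tail ratios suggest why geometrically decreasing masses are convenient here: each block dominates everything after it by the factor $1-\delta$, which is the kind of relation that eqs.(5.19)-(5.21) are cited for in obtaining eq.(A.20).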

Then, analogous to the proof of Theorem 5.1 (see eqs.(5.19)-(5.21)), it follows that

$P(a_j|b_j) \ge 1-\delta$, for all j in J. (A.20)

In addition, since only either c'd or d' (but not both) appears explicitly at each block$_j$, j = 1,...,M,
one has only the possibilities:

Situation 1. There is at least some block$_j$ in which c'd appears. Then, clearly, since P is unity
over the disjunction of all of the blocks,

$P(c'|d) = P(c'd)/P(d) \ge \sum_{(c'd \text{ appears at block } i)} P(\mathrm{block}_i) \,/\, \big(\sum_{(c'd \text{ appears at block } i)} P(\mathrm{block}_i)\big) = 1$. (A.21)

Hence, eqs.(A.20) and (A.21) show that not[$(a|b)_J \le_{WHPL} (c|d)$], due to the arbitrariness of $\delta$.
Situation 2. There is no block in which c'd appears, i.e., only d' appears in each of the blocks.
Thus, in this situation,

$\emptyset \ne \tau_{K_i} \,\&\, d'$, i = 1,...,M. (A.22)

But, eq.(A.22) implies

$\emptyset \ne \tau_{K_i}$, i = 1,...,M. (A.23)

Then, according to Theorem 5.1, eq.(A.23) ensures that $(a|b)_J$ is SHPL consistent. On the other
hand, eq.(A.22) and the construction of P in eq.(A.19) show, analogous to eq.(A.21), that

$P(d') = 1$. (A.24)

Also, eqs.(A.20) and (A.24) imply that not[$(a|b)_J \le_{SHPL} (c|d)$] (we need $P(c|d)$ positive and “high”
for SHPL to hold). In turn, because of SHPL consistency, Theorem 5.4 shows that not[$(a|b)_J \le_{SHPL} (c|d)$]
implies not[$(a|b)_J \le_{WHPL} (c|d)$]. Hence, finally, not[$(a|b)_J \le_{WHPL} (c|d)$] also holds in this
situation. ■


Finally: 

Proof of Theorem 5.6: Simply combine Lemmas A.4 and A.5. ■